Overview

Dataset statistics

Number of variables: 34
Number of observations: 610895
Missing cells: 3727223
Missing cells (%): 17.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 730.6 MiB
Average record size in memory: 1.2 KiB
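
The percentage of missing cells and the average record size follow from the other figures in this block: the table holds 34 × 610895 cells in total. A minimal check in Python, using only the numbers reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
n_variables = 34
n_observations = 610_895
missing_cells = 3_727_223
total_mib = 730.6

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average record size in KiB (1 MiB = 1024 KiB).
avg_record_kib = total_mib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing: {missing_pct:.1f}%")
print(f"average record: {avg_record_kib:.1f} KiB")
```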

Variable types

Categorical: 19
Numeric: 13
Boolean: 1
Unsupported: 1

Alerts

Filed Online has constant value "True" (Constant)
ESNCAG - Boundary File has constant value "1.0" (Constant)
Central Market/Tenderloin Boundary Polygon - Updated has constant value "1.0" (Constant)
Civic Center Harm Reduction Project Boundary has constant value "1.0" (Constant)
Incident Datetime has a high cardinality: 291613 distinct values (High cardinality)
Incident Date has a high cardinality: 1750 distinct values (High cardinality)
Incident Time has a high cardinality: 1440 distinct values (High cardinality)
Report Datetime has a high cardinality: 438062 distinct values (High cardinality)
Incident Subcategory has a high cardinality: 71 distinct values (High cardinality)
Incident Description has a high cardinality: 829 distinct values (High cardinality)
Intersection has a high cardinality: 6373 distinct values (High cardinality)
Point has a high cardinality: 6460 distinct values (High cardinality)
Incident Year is highly overall correlated with Row ID and 3 other fields (High correlation)
Row ID is highly overall correlated with Incident Year and 3 other fields (High correlation)
Incident ID is highly overall correlated with Incident Year and 3 other fields (High correlation)
Incident Number is highly overall correlated with Incident Year and 3 other fields (High correlation)
CAD Number is highly overall correlated with Incident Year and 3 other fields (High correlation)
Incident Code is highly overall correlated with Incident Category and 1 other field (High correlation)
CNN is highly overall correlated with Supervisor District and 1 other field (High correlation)
Supervisor District is highly overall correlated with CNN and 4 other fields (High correlation)
Latitude is highly overall correlated with Supervisor District and 2 other fields (High correlation)
Longitude is highly overall correlated with Current Police Districts and 2 other fields (High correlation)
Neighborhoods is highly overall correlated with Police District and 2 other fields (High correlation)
Current Supervisor Districts is highly overall correlated with Police District and 2 other fields (High correlation)
Current Police Districts is highly overall correlated with Longitude and 3 other fields (High correlation)
Report Type Code is highly overall correlated with Report Type Description and 2 other fields (High correlation)
Report Type Description is highly overall correlated with Report Type Code and 2 other fields (High correlation)
Incident Category is highly overall correlated with Incident Code and 3 other fields (High correlation)
Incident Subcategory is highly overall correlated with Incident Code and 3 other fields (High correlation)
Police District is highly overall correlated with Supervisor District and 5 other fields (High correlation)
Analysis Neighborhood is highly overall correlated with CNN and 8 other fields (High correlation)
HSOC Zones as of 2018-06-05 is highly overall correlated with Supervisor District and 7 other fields (High correlation)
Resolution is highly imbalanced (60.8%) (Imbalance)
CAD Number has 137235 (22.5%) missing values (Missing)
Filed Online has 486720 (79.7%) missing values (Missing)
Intersection has 32624 (5.3%) missing values (Missing)
CNN has 32624 (5.3%) missing values (Missing)
Analysis Neighborhood has 32738 (5.4%) missing values (Missing)
Supervisor District has 32624 (5.3%) missing values (Missing)
Latitude has 32624 (5.3%) missing values (Missing)
Longitude has 32624 (5.3%) missing values (Missing)
Point has 32624 (5.3%) missing values (Missing)
Neighborhoods has 45029 (7.4%) missing values (Missing)
ESNCAG - Boundary File has 604133 (98.9%) missing values (Missing)
Central Market/Tenderloin Boundary Polygon - Updated has 532645 (87.2%) missing values (Missing)
Civic Center Harm Reduction Project Boundary has 532929 (87.2%) missing values (Missing)
HSOC Zones as of 2018-06-05 has 482105 (78.9%) missing values (Missing)
Invest In Neighborhoods (IIN) Areas has 610895 (100.0%) missing values (Missing)
Current Supervisor Districts has 32728 (5.4%) missing values (Missing)
Current Police Districts has 33332 (5.5%) missing values (Missing)
CAD Number is highly skewed (γ1 = 21.75604076) (Skewed)
Report Datetime is uniformly distributed (Uniform)
Invest In Neighborhoods (IIN) Areas is an unsupported type, check if it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-04-20 08:17:46.703200
Analysis finished: 2023-04-20 08:19:34.841548
Duration: 1 minute and 48.14 seconds
Download configuration: config.json
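
The reported duration is simply the difference between the two timestamps. A short sketch with the standard library, using the values copied from this section, reproduces it:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-04-20 08:17:46.703200", FMT)
finished = datetime.strptime("2023-04-20 08:19:34.841548", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```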

Variables

Incident Datetime
Categorical

Distinct: 291613
Distinct (%): 47.7%
Missing: 0
Missing (%): 0.0%
Memory size: 42.5 MiB
23-11-2021 13:00
 
96
01-01-2018 00:00
 
74
01-01-2019 00:00
 
67
19-04-2022 03:30
 
60
01-01-2021 00:00
 
53
Other values (291608)
610545 

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 9774320
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
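
The categories counted here are Unicode general categories. A minimal sketch using Python's standard `unicodedata` module on the first sample value of this variable shows where the four categories come from; the codes `Nd`, `Pd`, `Zs`, and `Po` stand for Decimal Number, Dash Punctuation, Space Separator, and Other Punctuation.

```python
import unicodedata
from collections import Counter

sample = "25-07-2021 00:00"  # 1st sample row of this variable

# Count the Unicode general category of each character.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(dict(categories))

# Every value is 16 characters long, so the total character count
# is 16 times the number of observations.
n_observations = 610_895
total_characters = len(sample) * n_observations
print(total_characters)
```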

Unique

Unique: 168278
Unique (%): 27.5%

Sample

1st row: 25-07-2021 00:00
2nd row: 28-06-2022 23:58
3rd row: 11-03-2022 10:30
4th row: 15-05-2021 17:47
5th row: 28-06-2022 17:22
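
The sample values follow a fixed day-month-year hour:minute layout, and the column is profiled as a categorical string rather than as a datetime. A minimal sketch, assuming the `%d-%m-%Y %H:%M` format holds for every row (the samples suggest it, but the report does not state it), parses the values with the standard library:

```python
from datetime import datetime

# Format inferred from the sample rows; an assumption, not stated in the report.
FMT = "%d-%m-%Y %H:%M"

samples = [
    "25-07-2021 00:00",
    "28-06-2022 23:58",
    "11-03-2022 10:30",
    "15-05-2021 17:47",
    "28-06-2022 17:22",
]

# Parse each string into a datetime object.
parsed = [datetime.strptime(s, FMT) for s in samples]
for s, dt in zip(samples, parsed):
    print(s, "->", dt.isoformat())
```

Once parsed, the values sort chronologically, which the day-first strings do not.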

Common Values

ValueCountFrequency (%)
23-11-2021 13:00 96
 
< 0.1%
01-01-2018 00:00 74
 
< 0.1%
01-01-2019 00:00 67
 
< 0.1%
19-04-2022 03:30 60
 
< 0.1%
01-01-2021 00:00 53
 
< 0.1%
01-01-2020 00:00 52
 
< 0.1%
01-02-2018 00:00 48
 
< 0.1%
01-08-2018 00:00 47
 
< 0.1%
01-04-2019 00:00 47
 
< 0.1%
01-01-2022 00:00 46
 
< 0.1%
Other values (291603) 610305
99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
00:00 17615
 
1.4%
12:00 16484
 
1.3%
18:00 12923
 
1.1%
17:00 11808
 
1.0%
20:00 11634
 
1.0%
19:00 11349
 
0.9%
15:00 9991
 
0.8%
21:00 9898
 
0.8%
16:00 9775
 
0.8%
22:00 9708
 
0.8%
Other values (3180) 1100605
90.1%

Most occurring characters

ValueCountFrequency (%)
0 2315116
23.7%
2 1630849
16.7%
1 1437821
14.7%
- 1221790
12.5%
610895
 
6.2%
: 610895
 
6.2%
8 352716
 
3.6%
3 349986
 
3.6%
9 331938
 
3.4%
5 292612
 
3.0%
Other values (3) 619702
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7330740
75.0%
Dash Punctuation 1221790
 
12.5%
Space Separator 610895
 
6.2%
Other Punctuation 610895
 
6.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2315116
31.6%
2 1630849
22.2%
1 1437821
19.6%
8 352716
 
4.8%
3 349986
 
4.8%
9 331938
 
4.5%
5 292612
 
4.0%
4 247672
 
3.4%
7 191734
 
2.6%
6 180296
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
- 1221790
100.0%
Space Separator
ValueCountFrequency (%)
610895
100.0%
Other Punctuation
ValueCountFrequency (%)
: 610895
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9774320
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2315116
23.7%
2 1630849
16.7%
1 1437821
14.7%
- 1221790
12.5%
610895
 
6.2%
: 610895
 
6.2%
8 352716
 
3.6%
3 349986
 
3.6%
9 331938
 
3.4%
5 292612
 
3.0%
Other values (3) 619702
 
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9774320
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2315116
23.7%
2 1630849
16.7%
1 1437821
14.7%
- 1221790
12.5%
610895
 
6.2%
: 610895
 
6.2%
8 352716
 
3.6%
3 349986
 
3.6%
9 331938
 
3.4%
5 292612
 
3.0%
Other values (3) 619702
 
6.3%

Incident Date
Categorical

Distinct: 1750
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 39.0 MiB
26-06-2022
 
598
30-06-2019
 
578
01-08-2018
 
556
01-01-2018
 
540
02-10-2019
 
531
Other values (1745)
608092 

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 6108950
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): < 0.1%

Sample

1st row: 25-07-2021
2nd row: 28-06-2022
3rd row: 11-03-2022
4th row: 15-05-2021
5th row: 28-06-2022

Common Values

ValueCountFrequency (%)
26-06-2022 598
 
0.1%
30-06-2019 578
 
0.1%
01-08-2018 556
 
0.1%
01-01-2018 540
 
0.1%
02-10-2019 531
 
0.1%
24-08-2018 528
 
0.1%
01-02-2019 524
 
0.1%
01-01-2020 519
 
0.1%
03-04-2019 519
 
0.1%
01-11-2019 515
 
0.1%
Other values (1740) 605487
99.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
26-06-2022 598
 
0.1%
30-06-2019 578
 
0.1%
01-08-2018 556
 
0.1%
01-01-2018 540
 
0.1%
02-10-2019 531
 
0.1%
24-08-2018 528
 
0.1%
01-02-2019 524
 
0.1%
01-01-2020 519
 
0.1%
03-04-2019 519
 
0.1%
01-11-2019 515
 
0.1%
Other values (1740) 605487
99.1%

Most occurring characters

ValueCountFrequency (%)
0 1476602
24.2%
2 1363099
22.3%
- 1221790
20.0%
1 929933
15.2%
8 268586
 
4.4%
9 251343
 
4.1%
3 143945
 
2.4%
7 115875
 
1.9%
5 113384
 
1.9%
6 112350
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4887160
80.0%
Dash Punctuation 1221790
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1476602
30.2%
2 1363099
27.9%
1 929933
19.0%
8 268586
 
5.5%
9 251343
 
5.1%
3 143945
 
2.9%
7 115875
 
2.4%
5 113384
 
2.3%
6 112350
 
2.3%
4 112043
 
2.3%
Dash Punctuation
ValueCountFrequency (%)
- 1221790
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6108950
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1476602
24.2%
2 1363099
22.3%
- 1221790
20.0%
1 929933
15.2%
8 268586
 
4.4%
9 251343
 
4.1%
3 143945
 
2.4%
7 115875
 
1.9%
5 113384
 
1.9%
6 112350
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6108950
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1476602
24.2%
2 1363099
22.3%
- 1221790
20.0%
1 929933
15.2%
8 268586
 
4.4%
9 251343
 
4.1%
3 143945
 
2.4%
7 115875
 
1.9%
5 113384
 
1.9%
6 112350
 
1.8%

Incident Time
Categorical

Distinct: 1440
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 36.1 MiB
00:00
 
17615
12:00
 
16484
18:00
 
12923
17:00
 
11808
20:00
 
11634
Other values (1435)
540431 

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 3054475
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 00:00
2nd row: 23:58
3rd row: 10:30
4th row: 17:47
5th row: 17:22
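
The 1440 distinct values of Incident Time equal the number of minutes in a day (24 × 60), so every minute of the day occurs at least once. A small sketch, assuming the `HH:MM` layout seen in the sample rows, converts a value to its minute of the day, which is a more convenient numeric form than the string:

```python
def minute_of_day(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

# Sample rows copied from the report.
samples = ["00:00", "23:58", "10:30", "17:47", "17:22"]
print([minute_of_day(s) for s in samples])

# Number of distinct minute values possible in one day.
minutes_per_day = 24 * 60
print(minutes_per_day)
```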

Common Values

ValueCountFrequency (%)
00:00 17615
 
2.9%
12:00 16484
 
2.7%
18:00 12923
 
2.1%
17:00 11808
 
1.9%
20:00 11634
 
1.9%
19:00 11349
 
1.9%
15:00 9991
 
1.6%
21:00 9898
 
1.6%
16:00 9775
 
1.6%
22:00 9708
 
1.6%
Other values (1430) 489710
80.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
00:00 17615
 
2.9%
12:00 16484
 
2.7%
18:00 12923
 
2.1%
17:00 11808
 
1.9%
20:00 11634
 
1.9%
19:00 11349
 
1.9%
15:00 9991
 
1.6%
21:00 9898
 
1.6%
16:00 9775
 
1.6%
22:00 9708
 
1.6%
Other values (1430) 489710
80.2%

Most occurring characters

ValueCountFrequency (%)
0 838514
27.5%
: 610895
20.0%
1 507888
16.6%
2 267750
 
8.8%
3 206041
 
6.7%
5 179228
 
5.9%
4 135629
 
4.4%
8 84130
 
2.8%
9 80595
 
2.6%
7 75859
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2443580
80.0%
Other Punctuation 610895
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 838514
34.3%
1 507888
20.8%
2 267750
 
11.0%
3 206041
 
8.4%
5 179228
 
7.3%
4 135629
 
5.6%
8 84130
 
3.4%
9 80595
 
3.3%
7 75859
 
3.1%
6 67946
 
2.8%
Other Punctuation
ValueCountFrequency (%)
: 610895
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3054475
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 838514
27.5%
: 610895
20.0%
1 507888
16.6%
2 267750
 
8.8%
3 206041
 
6.7%
5 179228
 
5.9%
4 135629
 
4.4%
8 84130
 
2.8%
9 80595
 
2.6%
7 75859
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3054475
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 838514
27.5%
: 610895
20.0%
1 507888
16.6%
2 267750
 
8.8%
3 206041
 
6.7%
5 179228
 
5.9%
4 135629
 
4.4%
8 84130
 
2.8%
9 80595
 
2.6%
7 75859
 
2.5%

Incident Year
Real number (ℝ)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.7531
Minimum: 2018
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 2018
5-th percentile: 2018
Q1: 2019
Median: 2020
Q3: 2021
95-th percentile: 2022
Maximum: 2023
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4010884
Coefficient of variation (CV): 0.00069369293
Kurtosis: -1.2837967
Mean: 2019.7531
Median Absolute Deviation (MAD): 1
Skewness: 0.21485761
Sum: 1.233857 × 10^9
Variance: 1.9630488
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
2018 152475
25.0%
2019 148061
24.2%
2021 125833
20.6%
2020 96369
15.8%
2022 88148
14.4%
2023 9
 
< 0.1%
ValueCountFrequency (%)
2018 152475
25.0%
2019 148061
24.2%
2020 96369
15.8%
2021 125833
20.6%
2022 88148
14.4%
2023 9
 
< 0.1%
ValueCountFrequency (%)
2023 9
 
< 0.1%
2022 88148
14.4%
2021 125833
20.6%
2020 96369
15.8%
2019 148061
24.2%
2018 152475
25.0%
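
The summary statistics for Incident Year can be reproduced from the value counts alone. The sketch below assumes the reported variance is the sample variance (dividing by n − 1); that convention is an assumption about the profiler rather than something the report states.

```python
import math

# Value counts copied from the Incident Year frequency table.
counts = {2018: 152475, 2019: 148061, 2020: 96369,
          2021: 125833, 2022: 88148, 2023: 9}

n = sum(counts.values())
total = sum(year * c for year, c in counts.items())
mean = total / n

# Sample variance (n - 1 in the denominator); assumed convention.
variance = sum(c * (year - mean) ** 2 for year, c in counts.items()) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, total)
print(f"mean={mean:.4f} variance={variance:.7f} std={std:.7f} cv={cv:.6g}")
```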

Incident Day of Week
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 37.4 MiB
Friday
93304 
Wednesday
90560 
Monday
86911 
Thursday
86627 
Saturday
86463 
Other values (2)
167030 

Length

Max length: 9
Median length: 8
Mean length: 7.1528086
Min length: 6

Characters and Unicode

Total characters: 4369615
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sunday
2nd row: Tuesday
3rd row: Friday
4th row: Saturday
5th row: Tuesday
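
These sample rows line up with the Incident Date samples (25-07-2021, 28-06-2022, 11-03-2022, 15-05-2021, 28-06-2022). A sketch, assuming this column is simply the weekday of Incident Date (the report does not say how it was derived), recomputes the names with the standard library. The name list is spelled out so the result does not depend on the system locale.

```python
from datetime import datetime

# Monday is 0 in datetime.weekday().
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def day_of_week(date_str: str) -> str:
    """Return the English weekday name for a 'dd-mm-YYYY' date string."""
    return DAY_NAMES[datetime.strptime(date_str, "%d-%m-%Y").weekday()]

# Incident Date sample rows copied from the report.
incident_dates = ["25-07-2021", "28-06-2022", "11-03-2022",
                  "15-05-2021", "28-06-2022"]
weekdays = [day_of_week(d) for d in incident_dates]
print(weekdays)
```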

Common Values

ValueCountFrequency (%)
Friday 93304
15.3%
Wednesday 90560
14.8%
Monday 86911
14.2%
Thursday 86627
14.2%
Saturday 86463
14.2%
Tuesday 86385
14.1%
Sunday 80645
13.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
friday 93304
15.3%
wednesday 90560
14.8%
monday 86911
14.2%
thursday 86627
14.2%
saturday 86463
14.2%
tuesday 86385
14.1%
sunday 80645
13.2%

Most occurring characters

ValueCountFrequency (%)
d 701455
16.1%
a 697358
16.0%
y 610895
14.0%
u 340120
7.8%
e 267505
 
6.1%
r 266394
 
6.1%
s 263572
 
6.0%
n 258116
 
5.9%
T 173012
 
4.0%
S 167108
 
3.8%
Other values (7) 624080
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3758720
86.0%
Uppercase Letter 610895
 
14.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 701455
18.7%
a 697358
18.6%
y 610895
16.3%
u 340120
9.0%
e 267505
 
7.1%
r 266394
 
7.1%
s 263572
 
7.0%
n 258116
 
6.9%
i 93304
 
2.5%
o 86911
 
2.3%
Other values (2) 173090
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
T 173012
28.3%
S 167108
27.4%
F 93304
15.3%
W 90560
14.8%
M 86911
14.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 4369615
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 701455
16.1%
a 697358
16.0%
y 610895
14.0%
u 340120
7.8%
e 267505
 
6.1%
r 266394
 
6.1%
s 263572
 
6.0%
n 258116
 
5.9%
T 173012
 
4.0%
S 167108
 
3.8%
Other values (7) 624080
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4369615
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 701455
16.1%
a 697358
16.0%
y 610895
14.0%
u 340120
7.8%
e 267505
 
6.1%
r 266394
 
6.1%
s 263572
 
6.0%
n 258116
 
5.9%
T 173012
 
4.0%
S 167108
 
3.8%
Other values (7) 624080
14.3%

Report Datetime
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 438062
Distinct (%): 71.7%
Missing: 0
Missing (%): 0.0%
Memory size: 42.5 MiB
23-11-2021 13:00
 
90
19-04-2022 03:30
 
55
27-06-2018 07:30
 
48
27-02-2019 05:19
 
34
10-10-2019 12:00
 
33
Other values (438057)
610635 

Length

Max length: 16
Median length: 16
Mean length: 16
Min length: 16

Characters and Unicode

Total characters: 9774320
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 320472
Unique (%): 52.5%

Sample

1st row: 25-07-2021 13:41
2nd row: 28-06-2022 23:58
3rd row: 11-03-2022 20:03
4th row: 15-05-2021 17:47
5th row: 28-06-2022 17:22

Common Values

ValueCountFrequency (%)
23-11-2021 13:00 90
 
< 0.1%
19-04-2022 03:30 55
 
< 0.1%
27-06-2018 07:30 48
 
< 0.1%
27-02-2019 05:19 34
 
< 0.1%
10-10-2019 12:00 33
 
< 0.1%
08-02-2018 19:23 26
 
< 0.1%
02-02-2019 14:13 24
 
< 0.1%
26-06-2018 10:27 21
 
< 0.1%
07-12-2018 13:00 21
 
< 0.1%
01-07-2019 18:00 21
 
< 0.1%
Other values (438052) 610522
99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
13:00 1959
 
0.2%
15:00 1892
 
0.2%
14:00 1891
 
0.2%
12:00 1872
 
0.2%
16:00 1772
 
0.1%
11:00 1711
 
0.1%
17:00 1608
 
0.1%
10:00 1599
 
0.1%
15:30 1481
 
0.1%
09:00 1445
 
0.1%
Other values (3183) 1204560
98.6%

Most occurring characters

ValueCountFrequency (%)
0 1905759
19.5%
2 1694108
17.3%
1 1525990
15.6%
- 1221790
12.5%
610895
 
6.2%
: 610895
 
6.2%
8 378440
 
3.9%
3 371587
 
3.8%
9 368049
 
3.8%
5 328675
 
3.4%
Other values (3) 758132
 
7.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7330740
75.0%
Dash Punctuation 1221790
 
12.5%
Space Separator 610895
 
6.2%
Other Punctuation 610895
 
6.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1905759
26.0%
2 1694108
23.1%
1 1525990
20.8%
8 378440
 
5.2%
3 371587
 
5.1%
9 368049
 
5.0%
5 328675
 
4.5%
4 313130
 
4.3%
7 225959
 
3.1%
6 219043
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 1221790
100.0%
Space Separator
ValueCountFrequency (%)
610895
100.0%
Other Punctuation
ValueCountFrequency (%)
: 610895
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9774320
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1905759
19.5%
2 1694108
17.3%
1 1525990
15.6%
- 1221790
12.5%
610895
 
6.2%
: 610895
 
6.2%
8 378440
 
3.9%
3 371587
 
3.8%
9 368049
 
3.8%
5 328675
 
3.4%
Other values (3) 758132
 
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9774320
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1905759
19.5%
2 1694108
17.3%
1 1525990
15.6%
- 1221790
12.5%
610895
 
6.2%
: 610895
 
6.2%
8 378440
 
3.9%
3 371587
 
3.8%
9 368049
 
3.8%
5 328675
 
3.4%
Other values (3) 758132
 
7.8%

Row ID
Real number (ℝ)

Distinct: 419011
Distinct (%): 68.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.0305426 × 10^10
Minimum: 6.1868707 × 10^10
Maximum: 1.23624 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 6.1868707 × 10^10
5-th percentile: 6.4824482 × 10^10
Q1: 7.5786316 × 10^10
Median: 8.9439026 × 10^10
Q3: 1.05262 × 10^11
95-th percentile: 1.16349 × 10^11
Maximum: 1.23624 × 10^11
Range: 6.1755293 × 10^10
Interquartile range (IQR): 2.9475684 × 10^10

Descriptive statistics

Standard deviation: 1.6706844 × 10^10
Coefficient of variation (CV): 0.18500376
Kurtosis: -1.2423387
Mean: 9.0305426 × 10^10
Median Absolute Deviation (MAD): 1.4736974 × 10^10
Skewness: 0.043507348
Sum: 5.5167133 × 10^16
Variance: 2.7911863 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1.05983 × 10^11  21  < 0.1%
1.00616 × 10^11  21  < 0.1%
1.17265 × 10^11  20  < 0.1%
1.09456 × 10^11  20  < 0.1%
1.17418 × 10^11  20  < 0.1%
1.09985 × 10^11  20  < 0.1%
1.00054 × 10^11  20  < 0.1%
1.01869 × 10^11  20  < 0.1%
1.12407 × 10^11  19  < 0.1%
1.18621 × 10^11  19  < 0.1%
Other values (419001)  610695  > 99.9%

Value  Count  Frequency (%)
6.186870704 × 10^10  1  < 0.1%
6.186910413 × 10^10  1  < 0.1%
6.18691153 × 10^10  1  < 0.1%
6.186970611 × 10^10  1  < 0.1%
6.18699121 × 10^10  1  < 0.1%
6.187010705 × 10^10  1  < 0.1%
6.187016501 × 10^10  1  < 0.1%
6.187016505 × 10^10  1  < 0.1%
6.187020307 × 10^10  1  < 0.1%
6.1870768 × 10^10  1  < 0.1%

Value  Count  Frequency (%)
1.23624 × 10^11  1  < 0.1%
1.23607 × 10^11  1  < 0.1%
1.2347 × 10^11  1  < 0.1%
1.23424 × 10^11  1  < 0.1%
1.23375 × 10^11  1  < 0.1%
1.23355 × 10^11  1  < 0.1%
1.2328 × 10^11  1  < 0.1%
1.23239 × 10^11  1  < 0.1%
1.23215 × 10^11  1  < 0.1%
1.23161 × 10^11  1  < 0.1%

Incident ID
Real number (ℝ)

Distinct: 512521
Distinct (%): 83.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 903053.92
Minimum: 618687
Maximum: 1236239
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 618687
5-th percentile: 648244.7
Q1: 757863
Median: 894390
Q3: 1052616
95-th percentile: 1163489.3
Maximum: 1236239
Range: 617552
Interquartile range (IQR): 294753

Descriptive statistics

Standard deviation: 167068.34
Coefficient of variation (CV): 0.18500373
Kurtosis: -1.2423385
Mean: 903053.92
Median Absolute Deviation (MAD): 147368
Skewness: 0.043506717
Sum: 5.5167113 × 10^11
Variance: 2.7911831 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
640948 4
 
< 0.1%
693983 4
 
< 0.1%
908319 4
 
< 0.1%
1028735 4
 
< 0.1%
944519 4
 
< 0.1%
1078388 4
 
< 0.1%
960299 4
 
< 0.1%
884391 4
 
< 0.1%
689199 4
 
< 0.1%
632466 4
 
< 0.1%
Other values (512511) 610855
> 99.9%
ValueCountFrequency (%)
618687 1
 
< 0.1%
618691 2
< 0.1%
618697 1
 
< 0.1%
618699 1
 
< 0.1%
618701 3
< 0.1%
618702 1
 
< 0.1%
618707 1
 
< 0.1%
618709 2
< 0.1%
618710 3
< 0.1%
618711 1
 
< 0.1%
ValueCountFrequency (%)
1236239 1
< 0.1%
1236072 1
< 0.1%
1234703 1
< 0.1%
1234236 1
< 0.1%
1233754 1
< 0.1%
1233547 1
< 0.1%
1232800 1
< 0.1%
1232386 1
< 0.1%
1232150 1
< 0.1%
1231607 1
< 0.1%

Incident Number
Real number (ℝ)

Distinct: 445575
Distinct (%): 72.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9910225 × 10^8
Minimum: 0
Maximum: 9.8142426 × 10^8
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1.8024617 × 10^8
Q1: 1.9000774 × 10^8
Median: 2.0004958 × 10^8
Q3: 2.1052825 × 10^8
95-th percentile: 2.2049413 × 10^8
Maximum: 9.8142426 × 10^8
Range: 9.8142426 × 10^8
Interquartile range (IQR): 20520509

Descriptive statistics

Standard deviation: 14541705
Coefficient of variation (CV): 0.07303637
Kurtosis: 71.569407
Mean: 1.9910225 × 10^8
Median Absolute Deviation (MAD): 10471194
Skewness: 1.4657063
Sum: 1.2163057 × 10^14
Variance: 2.1146119 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
190202001 52
 
< 0.1%
210330778 47
 
< 0.1%
190071345 34
 
< 0.1%
210394611 28
 
< 0.1%
210505959 21
 
< 0.1%
200080808 21
 
< 0.1%
190129518 20
 
< 0.1%
180354292 20
 
< 0.1%
190176490 18
 
< 0.1%
210724717 17
 
< 0.1%
Other values (445565) 610617
> 99.9%
ValueCountFrequency (%)
0 2
< 0.1%
1131000 1
< 0.1%
1808670 1
< 0.1%
1813494 1
< 0.1%
1819855 1
< 0.1%
1819873 1
< 0.1%
1831758 1
< 0.1%
1831875 1
< 0.1%
2000558 1
< 0.1%
2001459 1
< 0.1%
ValueCountFrequency (%)
981424262 1
 
< 0.1%
981171996 1
 
< 0.1%
970332979 1
 
< 0.1%
940072058 1
 
< 0.1%
793282725 1
 
< 0.1%
782312915 3
< 0.1%
700013570 1
 
< 0.1%
270762961 1
 
< 0.1%
251030935 1
 
< 0.1%
236011063 1
 
< 0.1%

CAD Number
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct: 351483
Distinct (%): 74.2%
Missing: 137235
Missing (%): 22.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9993409 × 10^8
Minimum: 1
Maximum: 1 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1.8079001 × 10^8
Q1: 1.9013253 × 10^8
Median: 2.0030147 × 10^8
Q3: 2.1196002 × 10^8
95-th percentile: 2.2171215 × 10^8
Maximum: 1 × 10^9
Range: 1 × 10^9
Interquartile range (IQR): 21827496

Descriptive statistics

Standard deviation: 22492841
Coefficient of variation (CV): 0.11250128
Kurtosis: 770.47581
Mean: 1.9993409 × 10^8
Median Absolute Deviation (MAD): 11501008
Skewness: 21.756041
Sum: 9.4700782 × 10^13
Variance: 5.0592788 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
999999999 225
 
< 0.1%
190531525 35
 
< 0.1%
180393444 26
 
< 0.1%
200453221 23
 
< 0.1%
213631865 22
 
< 0.1%
180681633 20
 
< 0.1%
211483573 20
 
< 0.1%
220292654 18
 
< 0.1%
213081374 17
 
< 0.1%
220271692 17
 
< 0.1%
Other values (351473) 473237
77.5%
(Missing) 137235
 
22.5%
ValueCountFrequency (%)
1 6
< 0.1%
18012428 1
 
< 0.1%
18165248 1
 
< 0.1%
18237257 1
 
< 0.1%
18303153 3
< 0.1%
19122287 2
 
< 0.1%
20085321 1
 
< 0.1%
20188282 1
 
< 0.1%
21164231 1
 
< 0.1%
22210522 1
 
< 0.1%
ValueCountFrequency (%)
999999999 225
< 0.1%
999990999 2
 
< 0.1%
982560450 1
 
< 0.1%
818632476 2
 
< 0.1%
628336061 1
 
< 0.1%
418221357 2
 
< 0.1%
400083487 1
 
< 0.1%
303181516 1
 
< 0.1%
301580764 1
 
< 0.1%
290002510 1
 
< 0.1%
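
The extreme CAD Number values listed above (999999999 occurring 225 times, and 1 occurring 6 times) sit far from the bulk of the distribution, whose 5th to 95th percentiles span roughly 1.8 × 10^8 to 2.2 × 10^8, and they likely contribute to the reported skewness. A sketch, assuming these two values are placeholders rather than real CAD numbers (an interpretation the report does not confirm), masks them before further analysis. `PLACEHOLDERS` and `mask_placeholders` are hypothetical names.

```python
from typing import Optional

# Values treated as placeholders; this choice is an assumption.
PLACEHOLDERS = {1, 999_999_999}

def mask_placeholders(values: list[Optional[int]]) -> list[Optional[int]]:
    """Replace assumed placeholder CAD numbers with None (missing)."""
    return [None if v in PLACEHOLDERS else v for v in values]

# A few values taken from the tables above, plus one missing entry.
cad_numbers = [999_999_999, 190_531_525, 1, None, 180_393_444]
cleaned = mask_placeholders(cad_numbers)
print(cleaned)
```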

Report Type Code
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.4 MiB
II
483861 
IS
63907 
VI
 
37608
VS
 
25519

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1221790
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: II
2nd row: VS
3rd row: II
4th row: VS
5th row: VS

Common Values

ValueCountFrequency (%)
II 483861
79.2%
IS 63907
 
10.5%
VI 37608
 
6.2%
VS 25519
 
4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ii 483861
79.2%
is 63907
 
10.5%
vi 37608
 
6.2%
vs 25519
 
4.2%

Most occurring characters

ValueCountFrequency (%)
I 1069237
87.5%
S 89426
 
7.3%
V 63127
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1221790
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 1069237
87.5%
S 89426
 
7.3%
V 63127
 
5.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 1221790
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1069237
87.5%
S 89426
 
7.3%
V 63127
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1221790
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1069237
87.5%
S 89426
 
7.3%
V 63127
 
5.2%

Report Type Description
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 39.5 MiB
Initial
373544 
Coplogic Initial
110317 
Initial Supplement
50049 
Vehicle Initial
37608 
Vehicle Supplement
 
25519

Length

Max length: 19
Median length: 7
Mean length: 10.750663
Min length: 7

Characters and Unicode

Total characters: 6567526
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Coplogic Initial
2nd row: Vehicle Supplement
3rd row: Coplogic Initial
4th row: Vehicle Supplement
5th row: Vehicle Supplement

Common Values

ValueCountFrequency (%)
Initial 373544
61.1%
Coplogic Initial 110317
 
18.1%
Initial Supplement 50049
 
8.2%
Vehicle Initial 37608
 
6.2%
Vehicle Supplement 25519
 
4.2%
Coplogic Supplement 13858
 
2.3%
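
The counts above help explain the high correlation between Report Type Code and Report Type Description: once the "Coplogic" variants are grouped with their plain counterparts, each description falls under exactly one code and the totals match the Report Type Code table (II 483861, IS 63907, VI 37608, VS 25519). The mapping below is inferred from the matching counts, not stated in the report, so treat it as an assumption.

```python
from collections import Counter

# Counts copied from the Report Type Description table.
description_counts = {
    "Initial": 373544,
    "Coplogic Initial": 110317,
    "Initial Supplement": 50049,
    "Vehicle Initial": 37608,
    "Vehicle Supplement": 25519,
    "Coplogic Supplement": 13858,
}

# Inferred description -> code mapping (an assumption based on the counts).
description_to_code = {
    "Initial": "II",
    "Coplogic Initial": "II",
    "Initial Supplement": "IS",
    "Coplogic Supplement": "IS",
    "Vehicle Initial": "VI",
    "Vehicle Supplement": "VS",
}

# Aggregate description counts per inferred code.
code_counts = Counter()
for description, count in description_counts.items():
    code_counts[description_to_code[description]] += count

print(dict(code_counts))
```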

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
initial 571518
67.4%
coplogic 124175
 
14.6%
supplement 89426
 
10.5%
vehicle 63127
 
7.4%

Most occurring characters

ValueCountFrequency (%)
i 1330338
20.3%
l 848246
12.9%
t 660944
10.1%
n 660944
10.1%
I 571518
8.7%
a 571518
8.7%
e 305106
 
4.6%
p 303027
 
4.6%
o 248350
 
3.8%
237351
 
3.6%
Other values (8) 830184
12.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5481929
83.5%
Uppercase Letter 848246
 
12.9%
Space Separator 237351
 
3.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1330338
24.3%
l 848246
15.5%
t 660944
12.1%
n 660944
12.1%
a 571518
10.4%
e 305106
 
5.6%
p 303027
 
5.5%
o 248350
 
4.5%
c 187302
 
3.4%
g 124175
 
2.3%
Other values (3) 241979
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
I 571518
67.4%
C 124175
 
14.6%
S 89426
 
10.5%
V 63127
 
7.4%
Space Separator
ValueCountFrequency (%)
237351
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6330175
96.4%
Common 237351
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1330338
21.0%
l 848246
13.4%
t 660944
10.4%
n 660944
10.4%
I 571518
9.0%
a 571518
9.0%
e 305106
 
4.8%
p 303027
 
4.8%
o 248350
 
3.9%
c 187302
 
3.0%
Other values (7) 642882
10.2%
Common
ValueCountFrequency (%)
237351
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6567526
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1330338
20.3%
l 848246
12.9%
t 660944
10.1%
n 660944
10.1%
I 571518
8.7%
a 571518
8.7%
e 305106
 
4.6%
p 303027
 
4.6%
o 248350
 
3.8%
237351
 
3.6%
Other values (8) 830184
12.6%

Filed Online
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 486720
Missing (%): 79.7%
Memory size: 19.1 MiB
True
124175 
(Missing)
486720 
ValueCountFrequency (%)
True 124175
 
20.3%
(Missing) 486720
79.7%
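
Filed Online only ever takes the value True; the remaining 79.7% of rows are missing rather than False. The True count (124175) equals the number of rows whose Report Type Description contains "Coplogic" (110317 + 13858), which suggests, though the report does not state it, that a missing value means the report was not filed online. Under that assumption, a sketch of filling the gaps (`fill_filed_online` is a hypothetical helper):

```python
from typing import Optional

def fill_filed_online(value: Optional[bool]) -> bool:
    """Treat a missing 'Filed Online' flag as False (assumed meaning)."""
    return bool(value)

# Consistency check using counts copied from the report.
coplogic_rows = 110_317 + 13_858
filed_online_true = 124_175
n_observations = 610_895

print(coplogic_rows == filed_online_true)
print(f"{100 * filed_online_true / n_observations:.1f}% filed online")
print([fill_filed_online(v) for v in (True, None)])
```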

Incident Code
Real number (ℝ)

Distinct: 832
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24735.751
Minimum: 1000
Maximum: 75030
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 1000
5-th percentile: 4134
Q1: 6244
Median: 7041
Q3: 51040
95-th percentile: 71024
Maximum: 75030
Range: 74030
Interquartile range (IQR): 44796

Descriptive statistics

Standard deviation: 25703.749
Coefficient of variation (CV): 1.0391335
Kurtosis: -0.84179728
Mean: 24735.751
Median Absolute Deviation (MAD): 2907
Skewness: 0.95103562
Sum: 1.5110947 × 10^10
Variance: 6.606827 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6244 78103
 
12.8%
28150 20379
 
3.3%
71000 18383
 
3.0%
4134 17928
 
2.9%
6372 17433
 
2.9%
7041 16716
 
2.7%
7021 16136
 
2.6%
6374 14547
 
2.4%
64020 13831
 
2.3%
28160 11333
 
1.9%
Other values (822) 386106
63.2%
ValueCountFrequency (%)
1000 8
 
< 0.1%
1001 11
 
< 0.1%
1002 4
 
< 0.1%
1003 4
 
< 0.1%
1004 2
 
< 0.1%
1005 1
 
< 0.1%
1160 45
< 0.1%
2001 1
 
< 0.1%
2002 1
 
< 0.1%
2003 2
 
< 0.1%
ValueCountFrequency (%)
75030 2497
 
0.4%
75025 2204
 
0.4%
75011 8
 
< 0.1%
75000 6649
1.1%
74024 3
 
< 0.1%
74020 13
 
< 0.1%
74000 6673
1.1%
73010 514
 
0.1%
73001 19
 
< 0.1%
73000 167
 
< 0.1%

Incident Category
Categorical

Distinct49
Distinct (%)< 0.1%
Missing495
Missing (%)0.1%
Memory size41.2 MiB
Larceny Theft
187395 
Other Miscellaneous
43264 
Malicious Mischief
41220 
Assault
37003 
Non-Criminal
36842 
Other values (44)
264676 

Length

Max length44
Median length40
Mean length13.705729
Min length4

Characters and Unicode

Total characters8365977
Distinct characters50
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
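
As a rough illustration of how such per-category counts can be produced, the sketch below tallies characters by Unicode general category using Python's standard `unicodedata` module. This is an assumption about the approach, not the profiling tool's actual implementation. The function name `category_counts` and the two sample strings are illustrative. The standard library returns two-letter codes, for example `Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator and `Pd` for Dash Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two category labels taken from this report, used as a tiny sample.
sample = ["Larceny Theft", "Non-Criminal"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```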

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowLarceny Theft
2nd rowOther Offenses
3rd rowLost Property
4th rowRecovered Vehicle
5th rowRecovered Vehicle

Common Values

ValueCountFrequency (%)
Larceny Theft 187395
30.7%
Other Miscellaneous 43264
 
7.1%
Malicious Mischief 41220
 
6.7%
Assault 37003
 
6.1%
Non-Criminal 36842
 
6.0%
Burglary 33837
 
5.5%
Motor Vehicle Theft 29882
 
4.9%
Recovered Vehicle 22925
 
3.8%
Fraud 19052
 
3.1%
Lost Property 18383
 
3.0%
Other values (39) 140597
23.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
theft 217345
19.3%
larceny 187395
16.6%
vehicle 53529
 
4.7%
other 53204
 
4.7%
miscellaneous 49429
 
4.4%
malicious 41220
 
3.7%
mischief 41220
 
3.7%
assault 37003
 
3.3%
non-criminal 36842
 
3.3%
burglary 33837
 
3.0%
Other values (62) 377193
33.4%

Most occurring characters

ValueCountFrequency (%)
e 960517
 
11.5%
r 611292
 
7.3%
517817
 
6.2%
a 496201
 
5.9%
i 480020
 
5.7%
n 470069
 
5.6%
c 463792
 
5.5%
t 458015
 
5.5%
s 428611
 
5.1%
h 382288
 
4.6%
Other values (40) 3097355
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6645810
79.4%
Uppercase Letter 1165059
 
13.9%
Space Separator 517817
 
6.2%
Dash Punctuation 36842
 
0.4%
Other Punctuation 207
 
< 0.1%
Open Punctuation 121
 
< 0.1%
Close Punctuation 121
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 960517
14.5%
r 611292
 
9.2%
a 496201
 
7.5%
i 480020
 
7.2%
n 470069
 
7.1%
c 463792
 
7.0%
t 458015
 
6.9%
s 428611
 
6.4%
h 382288
 
5.8%
o 357746
 
5.4%
Other values (15) 1537259
23.1%
Uppercase Letter
ValueCountFrequency (%)
T 233514
20.0%
L 205906
17.7%
M 175395
15.1%
O 98610
8.5%
C 70307
 
6.0%
A 64297
 
5.5%
V 61170
 
5.3%
R 39467
 
3.4%
N 36842
 
3.2%
P 35664
 
3.1%
Other values (9) 143887
12.4%
Other Punctuation
ValueCountFrequency (%)
, 139
67.1%
? 68
32.9%
Space Separator
ValueCountFrequency (%)
517817
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 36842
100.0%
Open Punctuation
ValueCountFrequency (%)
( 121
100.0%
Close Punctuation
ValueCountFrequency (%)
) 121
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7810869
93.4%
Common 555108
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 960517
 
12.3%
r 611292
 
7.8%
a 496201
 
6.4%
i 480020
 
6.1%
n 470069
 
6.0%
c 463792
 
5.9%
t 458015
 
5.9%
s 428611
 
5.5%
h 382288
 
4.9%
o 357746
 
4.6%
Other values (34) 2702318
34.6%
Common
ValueCountFrequency (%)
517817
93.3%
- 36842
 
6.6%
, 139
 
< 0.1%
( 121
 
< 0.1%
) 121
 
< 0.1%
? 68
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8365977
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 960517
 
11.5%
r 611292
 
7.3%
517817
 
6.2%
a 496201
 
5.9%
i 480020
 
5.7%
n 470069
 
5.6%
c 463792
 
5.5%
t 458015
 
5.5%
s 428611
 
5.1%
h 382288
 
4.6%
Other values (40) 3097355
37.0%

Incident Subcategory
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct71
Distinct (%)< 0.1%
Missing495
Missing (%)0.1%
Memory size42.3 MiB
Larceny - From Vehicle
106475 
Other
77365 
Larceny Theft - Other
43356 
Vandalism
40882 
Motor Vehicle Theft
 
29484
Other values (66)
312838 

Length

Max length40
Median length29
Mean length15.578242
Min length4

Characters and Unicode

Total characters9508959
Distinct characters52
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowLarceny Theft - Other
2nd rowOther Offenses
3rd rowLost Property
4th rowRecovered Vehicle
5th rowRecovered Vehicle

Common Values

ValueCountFrequency (%)
Larceny - From Vehicle 106475
17.4%
Other 77365
 
12.7%
Larceny Theft - Other 43356
 
7.1%
Vandalism 40882
 
6.7%
Motor Vehicle Theft 29484
 
4.8%
Simple Assault 23050
 
3.8%
Recovered Vehicle 22925
 
3.8%
Non-Criminal 21004
 
3.4%
Fraud 19980
 
3.3%
Lost Property 18383
 
3.0%
Other values (61) 207496
34.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
227525
15.1%
larceny 178473
11.9%
vehicle 168930
11.2%
other 144441
 
9.6%
from 125630
 
8.4%
theft 109004
 
7.2%
vandalism 40882
 
2.7%
assault 37004
 
2.5%
burglary 33837
 
2.2%
motor 29950
 
2.0%
Other values (81) 408339
27.1%

Most occurring characters

ValueCountFrequency (%)
e 1050525
 
11.0%
893615
 
9.4%
r 785689
 
8.3%
i 515380
 
5.4%
a 507130
 
5.3%
t 504543
 
5.3%
c 442465
 
4.7%
h 434534
 
4.6%
o 431186
 
4.5%
l 425520
 
4.5%
Other values (42) 3518372
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7067732
74.3%
Uppercase Letter 1297753
 
13.6%
Space Separator 893615
 
9.4%
Dash Punctuation 248082
 
2.6%
Other Punctuation 845
 
< 0.1%
Open Punctuation 466
 
< 0.1%
Close Punctuation 466
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1050525
14.9%
r 785689
11.1%
i 515380
 
7.3%
a 507130
 
7.2%
t 504543
 
7.1%
c 442465
 
6.3%
h 434534
 
6.1%
o 431186
 
6.1%
l 425520
 
6.0%
n 420259
 
5.9%
Other values (16) 1550501
21.9%
Uppercase Letter
ValueCountFrequency (%)
V 231435
17.8%
L 198676
15.3%
O 168735
13.0%
F 148630
11.5%
T 122769
9.5%
A 70202
 
5.4%
S 55139
 
4.2%
R 52001
 
4.0%
M 49726
 
3.8%
B 46680
 
3.6%
Other values (10) 153760
11.8%
Other Punctuation
ValueCountFrequency (%)
& 706
83.6%
, 139
 
16.4%
Space Separator
ValueCountFrequency (%)
893615
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 248082
100.0%
Open Punctuation
ValueCountFrequency (%)
( 466
100.0%
Close Punctuation
ValueCountFrequency (%)
) 466
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8365485
88.0%
Common 1143474
 
12.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1050525
 
12.6%
r 785689
 
9.4%
i 515380
 
6.2%
a 507130
 
6.1%
t 504543
 
6.0%
c 442465
 
5.3%
h 434534
 
5.2%
o 431186
 
5.2%
l 425520
 
5.1%
n 420259
 
5.0%
Other values (36) 2848254
34.0%
Common
ValueCountFrequency (%)
893615
78.1%
- 248082
 
21.7%
& 706
 
0.1%
( 466
 
< 0.1%
) 466
 
< 0.1%
, 139
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9508959
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1050525
 
11.0%
893615
 
9.4%
r 785689
 
8.3%
i 515380
 
5.4%
a 507130
 
5.3%
t 504543
 
5.3%
c 442465
 
4.7%
h 434534
 
4.6%
o 431186
 
4.5%
l 425520
 
4.5%
Other values (42) 3518372
37.0%

Incident Description
Categorical

Distinct829
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size50.3 MiB
Theft, From Locked Vehicle, >$950
78103 
Malicious Mischief, Vandalism to Property
 
20379
Lost Property
 
18383
Battery
 
17928
Theft, Other Property, $50-$200
 
17433
Other values (824)
458669 

Length

Max length84
Median length58
Mean length29.320744
Min length4

Characters and Unicode

Total characters17911896
Distinct characters73
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique82 ?
Unique (%)< 0.1%

Sample

1st rowTheft, Other Property, $50-$200
2nd rowLicense Plate, Recovered
3rd rowLost Property
4th rowVehicle, Recovered, Motorcycle
5th rowVehicle, Recovered, Auto

Common Values

ValueCountFrequency (%)
Theft, From Locked Vehicle, >$950 78103
 
12.8%
Malicious Mischief, Vandalism to Property 20379
 
3.3%
Lost Property 18383
 
3.0%
Battery 17928
 
2.9%
Theft, Other Property, $50-$200 17433
 
2.9%
Vehicle, Recovered, Auto 16716
 
2.7%
Vehicle, Stolen, Auto 16136
 
2.6%
Theft, Other Property, >$950 14547
 
2.4%
Mental Health Detention 13831
 
2.3%
Malicious Mischief, Vandalism to Vehicle 11333
 
1.9%
Other values (819) 386106
63.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
vehicle 180603
 
7.4%
theft 179865
 
7.4%
from 122383
 
5.0%
950 116701
 
4.8%
property 97563
 
4.0%
locked 92886
 
3.8%
other 58170
 
2.4%
to 46991
 
1.9%
stolen 40618
 
1.7%
malicious 39217
 
1.6%
Other values (912) 1451277
59.8%

Most occurring characters

ValueCountFrequency (%)
1830806
 
10.2%
e 1801589
 
10.1%
r 1076202
 
6.0%
o 1071925
 
6.0%
t 1060662
 
5.9%
i 950707
 
5.3%
, 754953
 
4.2%
c 690176
 
3.9%
n 689963
 
3.9%
l 669866
 
3.7%
Other values (63) 7315047
40.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12066864
67.4%
Uppercase Letter 2138410
 
11.9%
Space Separator 1830806
 
10.2%
Other Punctuation 814089
 
4.5%
Decimal Number 633473
 
3.5%
Currency Symbol 222273
 
1.2%
Math Symbol 123371
 
0.7%
Dash Punctuation 56298
 
0.3%
Open Punctuation 13156
 
0.1%
Close Punctuation 13156
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1801589
14.9%
r 1076202
 
8.9%
o 1071925
 
8.9%
t 1060662
 
8.8%
i 950707
 
7.9%
c 690176
 
5.7%
n 689963
 
5.7%
l 669866
 
5.6%
a 632693
 
5.2%
s 592072
 
4.9%
Other values (16) 2831009
23.5%
Uppercase Letter
ValueCountFrequency (%)
T 243672
11.4%
V 232734
10.9%
F 221075
10.3%
P 202608
9.5%
L 148509
 
6.9%
A 143141
 
6.7%
M 132149
 
6.2%
S 123931
 
5.8%
O 107231
 
5.0%
B 93895
 
4.4%
Other values (13) 489465
22.9%
Decimal Number
ValueCountFrequency (%)
0 271892
42.9%
5 172817
27.3%
9 139051
22.0%
2 49457
 
7.8%
1 185
 
< 0.1%
3 23
 
< 0.1%
6 20
 
< 0.1%
8 17
 
< 0.1%
4 9
 
< 0.1%
7 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
, 754953
92.7%
/ 29014
 
3.6%
. 26138
 
3.2%
& 3903
 
0.5%
" 72
 
< 0.1%
' 5
 
< 0.1%
; 4
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
> 116701
94.6%
< 6670
 
5.4%
Space Separator
ValueCountFrequency (%)
1830806
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 222273
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 56298
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13156
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13156
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14205274
79.3%
Common 3706622
 
20.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1801589
 
12.7%
r 1076202
 
7.6%
o 1071925
 
7.5%
t 1060662
 
7.5%
i 950707
 
6.7%
c 690176
 
4.9%
n 689963
 
4.9%
l 669866
 
4.7%
a 632693
 
4.5%
s 592072
 
4.2%
Other values (39) 4969419
35.0%
Common
ValueCountFrequency (%)
1830806
49.4%
, 754953
20.4%
0 271892
 
7.3%
$ 222273
 
6.0%
5 172817
 
4.7%
9 139051
 
3.8%
> 116701
 
3.1%
- 56298
 
1.5%
2 49457
 
1.3%
/ 29014
 
0.8%
Other values (14) 63360
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17911896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1830806
 
10.2%
e 1801589
 
10.1%
r 1076202
 
6.0%
o 1071925
 
6.0%
t 1060662
 
5.9%
i 950707
 
5.3%
, 754953
 
4.2%
c 690176
 
3.9%
n 689963
 
3.9%
l 669866
 
3.7%
Other values (63) 7315047
40.8%

Resolution
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size42.0 MiB
Open or Active
486889 
Cite or Arrest Adult
119011 
Unfounded
 
3403
Exceptional Adult
 
1592

Length

Max length20
Median length14
Mean length15.14885
Min length9

Characters and Unicode

Total characters9254357
Distinct characters22
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOpen or Active
2nd rowOpen or Active
3rd rowOpen or Active
4th rowOpen or Active
5th rowOpen or Active

Common Values

Value                  Count    Frequency (%)
Open or Active         486889   79.7%
Cite or Arrest Adult   119011   19.5%
Unfounded              3403     0.6%
Exceptional Adult      1592     0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
or 605900
31.2%
open 486889
25.1%
active 486889
25.1%
adult 120603
 
6.2%
cite 119011
 
6.1%
arrest 119011
 
6.1%
unfounded 3403
 
0.2%
exceptional 1592
 
0.1%

Most occurring characters

ValueCountFrequency (%)
1332403
14.4%
e 1216795
13.1%
t 847106
9.2%
r 843922
9.1%
A 726503
7.9%
o 610895
 
6.6%
i 607492
 
6.6%
n 495287
 
5.4%
p 488481
 
5.3%
c 488481
 
5.3%
Other values (12) 1596992
17.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6584556
71.2%
Uppercase Letter 1337398
 
14.5%
Space Separator 1332403
 
14.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1216795
18.5%
t 847106
12.9%
r 843922
12.8%
o 610895
9.3%
i 607492
9.2%
n 495287
7.5%
p 488481
7.4%
c 488481
7.4%
v 486889
7.4%
d 127409
 
1.9%
Other values (6) 371799
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
A 726503
54.3%
O 486889
36.4%
C 119011
 
8.9%
U 3403
 
0.3%
E 1592
 
0.1%
Space Separator
ValueCountFrequency (%)
1332403
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7921954
85.6%
Common 1332403
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1216795
15.4%
t 847106
10.7%
r 843922
10.7%
A 726503
9.2%
o 610895
7.7%
i 607492
7.7%
n 495287
6.3%
p 488481
6.2%
c 488481
6.2%
O 486889
6.1%
Other values (11) 1110103
14.0%
Common
ValueCountFrequency (%)
1332403
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9254357
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1332403
14.4%
e 1216795
13.1%
t 847106
9.2%
r 843922
9.1%
A 726503
7.9%
o 610895
 
6.6%
i 607492
 
6.6%
n 495287
 
5.4%
p 488481
 
5.3%
c 488481
 
5.3%
Other values (12) 1596992
17.3%

Intersection
Categorical

HIGH CARDINALITY  MISSING 

Distinct6373
Distinct (%)1.1%
Missing32624
Missing (%)5.3%
Memory size45.2 MiB
MARKET ST \ POWELL ST
 
3545
POWELL ST \ OFARRELL ST
 
2974
BOARDMAN PL \ BRYANT ST
 
2903
EDDY ST \ JONES ST
 
2575
20TH AVE \ WINSTON DR
 
2414
Other values (6368)
563860 

Length

Max length84
Median length60
Mean length23.218145
Min length12

Characters and Unicode

Total characters13426380
Distinct characters39
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique33 ?
Unique (%)< 0.1%

Sample

1st rowEXCELSIOR AVE \ MISSION ST
2nd rowNORTH POINT ST \ LARKIN ST
3rd rowFELL ST \ ASHBURY ST
4th rowFILLMORE ST \ SACRAMENTO ST
5th rowJERROLD AVE \ PHELPS ST

Common Values

ValueCountFrequency (%)
MARKET ST \ POWELL ST 3545
 
0.6%
POWELL ST \ OFARRELL ST 2974
 
0.5%
BOARDMAN PL \ BRYANT ST 2903
 
0.5%
EDDY ST \ JONES ST 2575
 
0.4%
20TH AVE \ WINSTON DR 2414
 
0.4%
16TH ST \ MISSION ST 2391
 
0.4%
08TH ST \ GROVE ST \ HYDE ST \ MARKET ST 2333
 
0.4%
04TH ST \ LONG BRIDGE ST 2113
 
0.3%
HYDE ST \ TURK ST 2002
 
0.3%
UNITED NATIONS PLZ \ LEAVENWORTH ST 1999
 
0.3%
Other values (6363) 553022
90.5%
(Missing) 32624
 
5.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
st 885804
28.4%
621448
19.9%
ave 190737
 
6.1%
mission 29678
 
1.0%
dr 24894
 
0.8%
blvd 23201
 
0.7%
market 22015
 
0.7%
eddy 15317
 
0.5%
way 15261
 
0.5%
geary 14715
 
0.5%
Other values (2051) 1275809
40.9%

Most occurring characters

ValueCountFrequency (%)
2540608
18.9%
T 1435301
10.7%
S 1328829
 
9.9%
A 930273
 
6.9%
E 861040
 
6.4%
\ 621448
 
4.6%
N 613583
 
4.6%
R 589051
 
4.4%
O 583456
 
4.3%
L 523076
 
3.9%
Other values (29) 3399715
25.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 9901346
73.7%
Space Separator 2540608
 
18.9%
Other Punctuation 621448
 
4.6%
Decimal Number 362541
 
2.7%
Dash Punctuation 437
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 1435301
14.5%
S 1328829
13.4%
A 930273
9.4%
E 861040
 
8.7%
N 613583
 
6.2%
R 589051
 
5.9%
O 583456
 
5.9%
L 523076
 
5.3%
I 401337
 
4.1%
H 345321
 
3.5%
Other values (16) 2290079
23.1%
Decimal Number
ValueCountFrequency (%)
0 77633
21.4%
1 67359
18.6%
2 59822
16.5%
4 32315
8.9%
6 27808
 
7.7%
3 26658
 
7.4%
8 19609
 
5.4%
9 17573
 
4.8%
7 17160
 
4.7%
5 16604
 
4.6%
Space Separator
ValueCountFrequency (%)
2540608
100.0%
Other Punctuation
ValueCountFrequency (%)
\ 621448
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 437
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9901346
73.7%
Common 3525034
 
26.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 1435301
14.5%
S 1328829
13.4%
A 930273
9.4%
E 861040
 
8.7%
N 613583
 
6.2%
R 589051
 
5.9%
O 583456
 
5.9%
L 523076
 
5.3%
I 401337
 
4.1%
H 345321
 
3.5%
Other values (16) 2290079
23.1%
Common
ValueCountFrequency (%)
2540608
72.1%
\ 621448
 
17.6%
0 77633
 
2.2%
1 67359
 
1.9%
2 59822
 
1.7%
4 32315
 
0.9%
6 27808
 
0.8%
3 26658
 
0.8%
8 19609
 
0.6%
9 17573
 
0.5%
Other values (3) 34201
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13426380
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2540608
18.9%
T 1435301
10.7%
S 1328829
 
9.9%
A 930273
 
6.9%
E 861040
 
6.4%
\ 621448
 
4.6%
N 613583
 
4.6%
R 589051
 
4.4%
O 583456
 
4.3%
L 523076
 
3.9%
Other values (29) 3399715
25.3%

CNN
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct6460
Distinct (%)1.1%
Missing32624
Missing (%)5.3%
Infinite0
Infinite (%)0.0%
Mean25330875
Minimum20013000
Maximum54203000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.7 MiB

Quantile statistics

Minimum20013000
5-th percentile20618000
Q123967000
median24924000
Q326469000
95-th percentile33229000
Maximum54203000
Range34190000
Interquartile range (IQR)2502000

Descriptive statistics

Standard deviation3095514.3
Coefficient of variation (CV)0.12220321
Kurtosis5.5925796
Mean25330875
Median Absolute Deviation (MAD)1120000
Skewness1.4918742
Sum1.4648111 × 10^13
Variance9.5822087 × 10^12
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
34016000 3545
 
0.6%
24904000 2974
 
0.5%
23914000 2903
 
0.5%
24929000 2575
 
0.4%
33719000 2414
 
0.4%
24170000 2391
 
0.4%
24429000 2333
 
0.4%
34168000 2113
 
0.3%
24933000 2002
 
0.3%
30044000 1999
 
0.3%
Other values (6450) 553022
90.5%
(Missing) 32624
 
5.3%
ValueCountFrequency (%)
20013000 139
< 0.1%
20034000 84
 
< 0.1%
20039000 76
 
< 0.1%
20041000 203
< 0.1%
20044000 108
 
< 0.1%
20046000 172
< 0.1%
20056000 170
< 0.1%
20058000 234
< 0.1%
20060000 326
0.1%
20061000 87
 
< 0.1%
ValueCountFrequency (%)
54203000 1
 
< 0.1%
54122000 5
 
< 0.1%
54004000 4
 
< 0.1%
51555000 3
 
< 0.1%
51545000 1
 
< 0.1%
51541000 20
< 0.1%
51535000 3
 
< 0.1%
51527000 12
 
< 0.1%
51484000 3
 
< 0.1%
51483000 41
< 0.1%

Police District
Categorical

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size37.7 MiB
Central
91668 
Northern
82733 
Mission
77568 
Southern
74473 
Tenderloin
58467 
Other values (6)
225986 

Length

Max length10
Median length9
Mean length7.674363
Min length4

Characters and Unicode

Total characters4688230
Distinct characters32
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSouthern
2nd rowOut of SF
3rd rowCentral
4th rowOut of SF
5th rowOut of SF

Common Values

ValueCountFrequency (%)
Central 91668
15.0%
Northern 82733
13.5%
Mission 77568
12.7%
Southern 74473
12.2%
Tenderloin 58467
9.6%
Bayview 53473
8.8%
Ingleside 45495
7.4%
Taraval 42760
7.0%
Richmond 38028
6.2%
Park 28424
 
4.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
central 91668
14.2%
northern 82733
12.8%
mission 77568
12.0%
southern 74473
11.5%
tenderloin 58467
9.0%
bayview 53473
8.3%
ingleside 45495
7.0%
taraval 42760
6.6%
richmond 38028
5.9%
park 28424
 
4.4%
Other values (3) 53418
8.3%

Most occurring characters

ValueCountFrequency (%)
n 526899
11.2%
e 510271
 
10.9%
r 461258
 
9.8%
i 350599
 
7.5%
o 349075
 
7.4%
a 301845
 
6.4%
t 266680
 
5.7%
l 238390
 
5.1%
s 200631
 
4.3%
h 195234
 
4.2%
Other values (22) 1287348
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4006111
85.5%
Uppercase Letter 646507
 
13.8%
Space Separator 35612
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 526899
13.2%
e 510271
12.7%
r 461258
11.5%
i 350599
8.8%
o 349075
8.7%
a 301845
7.5%
t 266680
6.7%
l 238390
 
6.0%
s 200631
 
5.0%
h 195234
 
4.9%
Other values (10) 605229
15.1%
Uppercase Letter
ValueCountFrequency (%)
T 101227
15.7%
S 92279
14.3%
C 91668
14.2%
N 82733
12.8%
M 77568
12.0%
B 53473
8.3%
I 45495
7.0%
R 38028
 
5.9%
P 28424
 
4.4%
O 17806
 
2.8%
Space Separator
ValueCountFrequency (%)
35612
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4652618
99.2%
Common 35612
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 526899
11.3%
e 510271
11.0%
r 461258
 
9.9%
i 350599
 
7.5%
o 349075
 
7.5%
a 301845
 
6.5%
t 266680
 
5.7%
l 238390
 
5.1%
s 200631
 
4.3%
h 195234
 
4.2%
Other values (21) 1251736
26.9%
Common
ValueCountFrequency (%)
35612
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4688230
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 526899
11.2%
e 510271
 
10.9%
r 461258
 
9.8%
i 350599
 
7.5%
o 349075
 
7.4%
a 301845
 
6.4%
t 266680
 
5.7%
l 238390
 
5.1%
s 200631
 
4.3%
h 195234
 
4.2%
Other values (22) 1287348
27.5%

Analysis Neighborhood
Categorical

HIGH CORRELATION  MISSING 

Distinct41
Distinct (%)< 0.1%
Missing32738
Missing (%)5.4%
Memory size40.2 MiB
Mission
62476 
Tenderloin
59256 
Financial District/South Beach
48559 
South of Market
47079 
Bayview Hunters Point
37497 
Other values (36)
323290 

Length

Max length30
Median length18
Mean length14.070759
Min length6

Characters and Unicode

Total characters8135108
Distinct characters46
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowExcelsior
2nd rowRussian Hill
3rd rowLone Mountain/USF
4th rowPacific Heights
5th rowBayview Hunters Point

Common Values

ValueCountFrequency (%)
Mission 62476
 
10.2%
Tenderloin 59256
 
9.7%
Financial District/South Beach 48559
 
7.9%
South of Market 47079
 
7.7%
Bayview Hunters Point 37497
 
6.1%
North Beach 19302
 
3.2%
Western Addition 18639
 
3.1%
Castro/Upper Market 17481
 
2.9%
Sunset/Parkside 17211
 
2.8%
Nob Hill 16667
 
2.7%
Other values (31) 233990
38.3%
(Missing) 32738
 
5.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
mission 81260
 
7.3%
beach 67861
 
6.1%
market 64560
 
5.8%
tenderloin 59256
 
5.3%
of 58817
 
5.3%
financial 48559
 
4.4%
district/south 48559
 
4.4%
south 47079
 
4.2%
hill 40529
 
3.6%
point 37497
 
3.4%
Other values (46) 560815
50.3%

Most occurring characters

ValueCountFrequency (%)
i 778052
 
9.6%
e 676283
 
8.3%
n 628962
 
7.7%
t 551608
 
6.8%
536635
 
6.6%
o 531311
 
6.5%
a 527962
 
6.5%
s 469918
 
5.8%
r 443813
 
5.5%
l 292033
 
3.6%
Other values (36) 2698531
33.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6322067
77.7%
Uppercase Letter 1173594
 
14.4%
Space Separator 536635
 
6.6%
Other Punctuation 102812
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 778052
12.3%
e 676283
10.7%
n 628962
9.9%
t 551608
8.7%
o 531311
8.4%
a 527962
8.4%
s 469918
7.4%
r 443813
 
7.0%
l 292033
 
4.6%
h 266229
 
4.2%
Other values (13) 1155896
18.3%
Uppercase Letter
ValueCountFrequency (%)
M 176132
15.0%
H 128138
10.9%
S 128087
10.9%
B 126589
10.8%
P 112057
9.5%
T 76111
 
6.5%
F 55618
 
4.7%
D 48559
 
4.1%
N 42184
 
3.6%
R 35191
 
3.0%
Other values (11) 244928
20.9%
Space Separator
ValueCountFrequency (%)
536635
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 102812
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7495661
92.1%
Common 639447
 
7.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 778052
 
10.4%
e 676283
 
9.0%
n 628962
 
8.4%
t 551608
 
7.4%
o 531311
 
7.1%
a 527962
 
7.0%
s 469918
 
6.3%
r 443813
 
5.9%
l 292033
 
3.9%
h 266229
 
3.6%
Other values (34) 2329490
31.1%
Common
ValueCountFrequency (%)
536635
83.9%
/ 102812
 
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8135108
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 778052
 
9.6%
e 676283
 
8.3%
n 628962
 
7.7%
t 551608
 
6.8%
536635
 
6.6%
o 531311
 
6.5%
a 527962
 
6.5%
s 469918
 
5.8%
r 443813
 
5.5%
l 292033
 
3.6%
Other values (36) 2698531
33.2%

Supervisor District
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct11
Distinct (%)< 0.1%
Missing32624
Missing (%)5.3%
Infinite0
Infinite (%)0.0%
Mean5.9612223
Minimum1
Maximum11
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.7 MiB

Quantile statistics

Minimum1
5-th percentile2
Q13
median6
Q38
95-th percentile10
Maximum11
Range10
Interquartile range (IQR)5

Descriptive statistics

Standard deviation2.8032855
Coefficient of variation (CV)0.47025347
Kurtosis-1.0250774
Mean5.9612223
Median Absolute Deviation (MAD)3
Skewness0.026484574
Sum3447202
Variance7.8584095
MonotonicityNot monotonic
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
6 135319
22.2%
3 86050
14.1%
10 62768
10.3%
5 58887
9.6%
9 57260
9.4%
2 44144
 
7.2%
8 44095
 
7.2%
1 27563
 
4.5%
7 24609
 
4.0%
11 21215
 
3.5%
(Missing) 32624
 
5.3%
ValueCountFrequency (%)
1 27563
 
4.5%
2 44144
 
7.2%
3 86050
14.1%
4 16361
 
2.7%
5 58887
9.6%
6 135319
22.2%
7 24609
 
4.0%
8 44095
 
7.2%
9 57260
9.4%
10 62768
10.3%
ValueCountFrequency (%)
11 21215
 
3.5%
10 62768
10.3%
9 57260
9.4%
8 44095
 
7.2%
7 24609
 
4.0%
6 135319
22.2%
5 58887
9.6%
4 16361
 
2.7%
3 86050
14.1%
2 44144
 
7.2%

Latitude
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct6458
Distinct (%)1.1%
Missing32624
Missing (%)5.3%
Infinite0
Infinite (%)0.0%
Mean37.769339
Minimum37.707988
Maximum37.829991
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.7 MiB

Quantile statistics

Minimum37.707988
5-th percentile37.720724
Q137.755295
median37.775894
Q337.785893
95-th percentile37.802791
Maximum37.829991
Range0.12200249
Interquartile range (IQR)0.03059812

Descriptive statistics

Standard deviation0.024366596
Coefficient of variation (CV)0.00064514224
Kurtosis-0.296199
Mean37.769339
Median Absolute Deviation (MAD)0.01301542
Skewness-0.68109821
Sum21840913
Variance0.00059373098
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.78456014 3545
 
0.6%
37.78640961 2974
 
0.5%
37.77516081 2903
 
0.5%
37.78393258 2575
 
0.4%
37.72694991 2414
 
0.4%
37.76505134 2391
 
0.4%
37.77871943 2333
 
0.4%
37.77346692 2113
 
0.3%
37.78258503 2002
 
0.3%
37.77999174 1999
 
0.3%
Other values (6448) 553022
90.5%
(Missing) 32624
 
5.3%
ValueCountFrequency (%)
37.70798826 10
 
< 0.1%
37.70802018 62
< 0.1%
37.70805761 31
 
< 0.1%
37.7082148 6
 
< 0.1%
37.70825596 62
< 0.1%
37.70830771 1
 
< 0.1%
37.70831127 108
< 0.1%
37.70832812 17
 
< 0.1%
37.70835434 22
 
< 0.1%
37.70844468 16
 
< 0.1%
ValueCountFrequency (%)
37.82999075 127
< 0.1%
37.82979158 31
 
< 0.1%
37.8296623 53
< 0.1%
37.82961662 62
< 0.1%
37.82954858 129
< 0.1%
37.82944921 57
< 0.1%
37.82911002 40
 
< 0.1%
37.82908934 77
< 0.1%
37.82834123 25
 
< 0.1%
37.82788815 55
< 0.1%

Longitude
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct6436
Distinct (%)1.1%
Missing32624
Missing (%)5.3%
Infinite0
Infinite (%)0.0%
Mean-122.42392
Minimum-122.51129
Maximum-122.36374
Zeros0
Zeros (%)0.0%
Negative578271
Negative (%)94.7%
Memory size4.7 MiB

Quantile statistics

Minimum-122.51129
5-th percentile-122.48045
Q1-122.4344
median-122.41771
Q3-122.40729
95-th percentile-122.3911
Maximum-122.36374
Range0.1475521
Interquartile range (IQR)0.0271135

Descriptive statistics

Standard deviation0.026350605
Coefficient of variation (CV)-0.00021524066
Kurtosis1.1965235
Mean-122.42392
Median Absolute Deviation (MAD)0.0125638
Skewness-1.1518327
Sum-70794201
Variance0.00069435436
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-122.407337 3545
 
0.6%
-122.4080362 2974
 
0.5%
-122.4036355 2903
 
0.5%
-122.4125953 2575
 
0.4%
-122.4760395 2414
 
0.4%
-122.419669 2391
 
0.4%
-122.4147412 2333
 
0.4%
-122.3914343 2113
 
0.3%
-122.4156939 2002
 
0.3%
-122.4134874 1999
 
0.3%
Other values (6426) 553022
90.5%
(Missing) 32624
 
5.3%
ValueCountFrequency (%)
-122.5112949 1363
0.2%
-122.5103413 20
 
< 0.1%
-122.5101688 99
 
< 0.1%
-122.510037 70
 
< 0.1%
-122.5098948 1137
0.2%
-122.5098792 177
 
< 0.1%
-122.5096222 33
 
< 0.1%
-122.5094329 188
 
< 0.1%
-122.5094022 315
 
0.1%
-122.5093683 21
 
< 0.1%
ValueCountFrequency (%)
-122.3637428 164
 
< 0.1%
-122.36843 52
 
< 0.1%
-122.3690371 46
 
< 0.1%
-122.3691332 3
 
< 0.1%
-122.3695409 11
 
< 0.1%
-122.3696925 32
 
< 0.1%
-122.3703524 84
 
< 0.1%
-122.3707119 54
 
< 0.1%
-122.3708198 46
 
< 0.1%
-122.3712459 448
0.1%

Point
Categorical

HIGH CARDINALITY  MISSING 

Distinct6460
Distinct (%)1.1%
Missing32624
Missing (%)5.3%
Memory size57.3 MiB
POINT (-122.40733704162238 37.784560141211806)
 
3545
POINT (-122.40803623744476 37.78640961281089)
 
2974
POINT (-122.40363551943442 37.7751608100771)
 
2903
POINT (-122.41259527758581 37.7839325760642)
 
2575
POINT (-122.47603947349434 37.72694991292525)
 
2414
Other values (6455)
563860 

Length

Max length46
Median length45
Mean length45.043824
Min length41

Characters and Unicode

Total characters26047537
Distinct characters20
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique35 ?
Unique (%)< 0.1%

Sample

1st rowPOINT (-122.43362359230944 37.72623624315635)
2nd rowPOINT (-122.42200682265661 37.80549664761133)
3rd rowPOINT (-122.44749724585684 37.77279045274103)
4th rowPOINT (-122.43402709034117 37.78983697125977)
5th rowPOINT (-122.39126842523832 37.73985319897475)
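
The Point values are stored as text of the form `POINT (x y)`. Comparing them with the Latitude and Longitude columns in this report suggests the first number is the longitude and the second the latitude. A minimal sketch for parsing one value, under that assumption, is given below; the names `POINT_RE` and `parse_point` are illustrative, and a dedicated geometry library may be preferable for anything beyond simple points.

```python
import re

# Matches text such as "POINT (-122.4073 37.7845)": two signed decimals in parentheses.
POINT_RE = re.compile(r"^POINT \((-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)\)$")


def parse_point(text):
    """Return (latitude, longitude) for a POINT string, or None if it does not match."""
    match = POINT_RE.match(text)
    if match is None:
        return None
    # The text lists longitude first, then latitude (an assumption based on this report).
    longitude = float(match.group(1))
    latitude = float(match.group(2))
    return latitude, longitude


print(parse_point("POINT (-122.40733704162238 37.784560141211806)"))
```

Missing values (5.3% of rows in this column) would need to be filtered out before calling the parser, since it expects a string.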

Common Values

Value | Count | Frequency (%)
POINT (-122.40733704162238 37.784560141211806) | 3545 | 0.6%
POINT (-122.40803623744476 37.78640961281089) | 2974 | 0.5%
POINT (-122.40363551943442 37.7751608100771) | 2903 | 0.5%
POINT (-122.41259527758581 37.7839325760642) | 2575 | 0.4%
POINT (-122.47603947349434 37.72694991292525) | 2414 | 0.4%
POINT (-122.41966897380142 37.76505133632968) | 2391 | 0.4%
POINT (-122.4147412230519 37.77871942789032) | 2333 | 0.4%
POINT (-122.39143433652146 37.773466920607476) | 2113 | 0.3%
POINT (-122.41569387441227 37.78258503232177) | 2002 | 0.3%
POINT (-122.41348740024354 37.77999173926721) | 1999 | 0.3%
Other values (6450) | 553022 | 90.5%
(Missing) | 32624 | 5.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
point | 578271 | 33.3%
37.784560141211806 | 3545 | 0.2%
122.40733704162238 | 3545 | 0.2%
122.40803623744476 | 2974 | 0.2%
37.78640961281089 | 2974 | 0.2%
122.40363551943442 | 2903 | 0.2%
37.7751608100771 | 2903 | 0.2%
122.41259527758581 | 2575 | 0.1%
37.7839325760642 | 2575 | 0.1%
122.47603947349434 | 2414 | 0.1%
Other values (12911) | 1130134 | 65.1%

Most occurring characters

ValueCountFrequency (%)
2 2642702
 
10.1%
7 2621506
 
10.1%
3 2128195
 
8.2%
1 2120749
 
8.1%
4 2023871
 
7.8%
8 1609631
 
6.2%
6 1522751
 
5.8%
5 1511264
 
5.8%
9 1481118
 
5.7%
0 1446498
 
5.6%
Other values (10) 6939252
26.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19108285
73.4%
Uppercase Letter 2891355
 
11.1%
Other Punctuation 1156542
 
4.4%
Space Separator 1156542
 
4.4%
Dash Punctuation 578271
 
2.2%
Open Punctuation 578271
 
2.2%
Close Punctuation 578271
 
2.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2642702
13.8%
7 2621506
13.7%
3 2128195
11.1%
1 2120749
11.1%
4 2023871
10.6%
8 1609631
8.4%
6 1522751
8.0%
5 1511264
7.9%
9 1481118
7.8%
0 1446498
7.6%
Uppercase Letter
ValueCountFrequency (%)
O 578271
20.0%
T 578271
20.0%
N 578271
20.0%
I 578271
20.0%
P 578271
20.0%
Other Punctuation
ValueCountFrequency (%)
. 1156542
100.0%
Space Separator
ValueCountFrequency (%)
1156542
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 578271
100.0%
Open Punctuation
ValueCountFrequency (%)
( 578271
100.0%
Close Punctuation
ValueCountFrequency (%)
) 578271
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23156182
88.9%
Latin 2891355
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2642702
11.4%
7 2621506
11.3%
3 2128195
9.2%
1 2120749
9.2%
4 2023871
8.7%
8 1609631
 
7.0%
6 1522751
 
6.6%
5 1511264
 
6.5%
9 1481118
 
6.4%
0 1446498
 
6.2%
Other values (5) 4047897
17.5%
Latin
ValueCountFrequency (%)
O 578271
20.0%
T 578271
20.0%
N 578271
20.0%
I 578271
20.0%
P 578271
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26047537
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2642702
 
10.1%
7 2621506
 
10.1%
3 2128195
 
8.2%
1 2120749
 
8.1%
4 2023871
 
7.8%
8 1609631
 
6.2%
6 1522751
 
5.8%
5 1511264
 
5.8%
9 1481118
 
5.7%
0 1446498
 
5.6%
Other values (10) 6939252
26.6%

Neighborhoods
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 116
Distinct (%): < 0.1%
Missing: 45029
Missing (%): 7.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.944906
Minimum: 1
Maximum: 117
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 23
Median: 48
Q3: 86
95-th percentile: 107
Maximum: 117
Range: 116
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 32.628641
Coefficient of variation (CV): 0.61627537
Kurtosis: -1.2351318
Mean: 52.944906
Median Absolute Deviation (MAD): 28
Skewness: 0.39822005
Sum: 29959722
Variance: 1064.6282
Monotonicity: Not monotonic
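
Several of the Neighborhoods statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of non-missing rows. A small consistency check in Python, using only figures printed in this report (the variable names are illustrative):

```python
import math

# Figures copied from the Neighborhoods summary and the dataset overview.
n_rows = 610895
n_missing = 45029
total = 29959722
mean = 52.944906
std = 32.628641
q1, q3 = 23, 86
minimum, maximum = 1, 117

cv = std / mean                            # coefficient of variation
variance = std ** 2                        # squared standard deviation
iqr = q3 - q1                              # interquartile range
value_range = maximum - minimum            # range
mean_check = total / (n_rows - n_missing)  # mean over non-missing rows only

print(cv, variance, iqr, value_range, mean_check)
```

Note that the values are neighborhood identifiers stored as numbers, so these statistics describe the code distribution rather than any physical quantity.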
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
32 | 56186 | 9.2%
53 | 46820 | 7.7%
20 | 36156 | 5.9%
19 | 21292 | 3.5%
21 | 18392 | 3.0%
86 | 12860 | 2.1%
99 | 12502 | 2.0%
50 | 11173 | 1.8%
54 | 10919 | 1.8%
39 | 10832 | 1.8%
Other values (106) | 328734 | 53.8%
(Missing) | 45029 | 7.4%
Value | Count | Frequency (%)
1 | 620 | 0.1%
2 | 320 | 0.1%
3 | 490 | 0.1%
4 | 772 | 0.1%
5 | 10399 | 1.7%
6 | 716 | 0.1%
7 | 177 | < 0.1%
8 | 10261 | 1.7%
9 | 4554 | 0.7%
10 | 1209 | 0.2%
Value | Count | Frequency (%)
117 | 347 | 0.1%
116 | 395 | 0.1%
115 | 2251 | 0.4%
114 | 806 | 0.1%
113 | 749 | 0.1%
112 | 2691 | 0.4%
111 | 348 | 0.1%
110 | 706 | 0.1%
109 | 4764 | 0.8%
108 | 10421 | 1.7%

ESNCAG - Boundary File
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 604133
Missing (%): 98.9%
Memory size: 23.4 MiB

Value | Count
1.0 | 6762

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 20286
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
1.0 | 6762 | 1.1%
(Missing) | 604133 | 98.9%
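
Because this column only ever holds "1.0" or a missing value, it behaves like a membership flag. A minimal Python sketch of that reading follows; it assumes that a missing cell means the incident lies outside the boundary, which the report itself does not state, and to_flag is a hypothetical helper name:

```python
import math


def to_flag(value):
    """Map a boundary cell to a boolean: 1.0 -> True, missing -> False.

    Assumption: missing means "outside the boundary".
    """
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return float(value) == 1.0


cells = ["1.0", float("nan"), None, 1.0]
flags = [to_flag(c) for c in cells]

# Missing share reported above: 604133 of 610895 rows.
missing_pct = round(100 * 604133 / 610895, 1)
print(flags, missing_pct)
```

Stored as a boolean, the column stops triggering the Constant alert for a reason that is only an artifact of the encoding.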

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 6762
100.0%

Most occurring characters

ValueCountFrequency (%)
1 6762
33.3%
. 6762
33.3%
0 6762
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13524
66.7%
Other Punctuation 6762
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 6762
50.0%
0 6762
50.0%
Other Punctuation
ValueCountFrequency (%)
. 6762
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 20286
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 6762
33.3%
. 6762
33.3%
0 6762
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 6762
33.3%
. 6762
33.3%
0 6762
33.3%
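
Central Market/Tenderloin Boundary Polygon - Updated
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 532645
Missing (%): 87.2%
Memory size: 24.8 MiB

Value | Count
1.0 | 78250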

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 234750
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
1.0 | 78250 | 12.8%
(Missing) | 532645 | 87.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 78250
100.0%

Most occurring characters

ValueCountFrequency (%)
1 78250
33.3%
. 78250
33.3%
0 78250
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 156500
66.7%
Other Punctuation 78250
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 78250
50.0%
0 78250
50.0%
Other Punctuation
ValueCountFrequency (%)
. 78250
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 234750
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 78250
33.3%
. 78250
33.3%
0 78250
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 234750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 78250
33.3%
. 78250
33.3%
0 78250
33.3%

Civic Center Harm Reduction Project Boundary
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 532929
Missing (%): 87.2%
Memory size: 24.8 MiB

Value | Count
1.0 | 77966

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 233898
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
1.0 | 77966 | 12.8%
(Missing) | 532929 | 87.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 77966
100.0%

Most occurring characters

ValueCountFrequency (%)
1 77966
33.3%
. 77966
33.3%
0 77966
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 155932
66.7%
Other Punctuation 77966
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 77966
50.0%
0 77966
50.0%
Other Punctuation
ValueCountFrequency (%)
. 77966
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 233898
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 77966
33.3%
. 77966
33.3%
0 77966
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 233898
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 77966
33.3%
. 77966
33.3%
0 77966
33.3%

HSOC Zones as of 2018-06-05
Categorical

HIGH CORRELATION  MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 482105
Missing (%): 78.9%
Memory size: 25.8 MiB

Value | Count
1.0 | 57713
3.0 | 50072
5.0 | 15941
4.0 | 2642
2.0 | 2422

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 386370
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3.0
2nd row: 3.0
3rd row: 1.0
4th row: 3.0
5th row: 3.0

Common Values

Value | Count | Frequency (%)
1.0 | 57713 | 9.4%
3.0 | 50072 | 8.2%
5.0 | 15941 | 2.6%
4.0 | 2642 | 0.4%
2.0 | 2422 | 0.4%
(Missing) | 482105 | 78.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1.0 | 57713 | 44.8%
3.0 | 50072 | 38.9%
5.0 | 15941 | 12.4%
4.0 | 2642 | 2.1%
2.0 | 2422 | 1.9%
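
The same counts appear with two different percentages in this section: the Common Values table divides by all 610895 rows, while the plot table divides by the non-missing rows only. A short Python sketch reproducing both sets of figures from the counts printed above:

```python
# Counts per HSOC zone, as listed in this section.
counts = {"1.0": 57713, "3.0": 50072, "5.0": 15941, "4.0": 2642, "2.0": 2422}
total_rows = 610895                  # number of observations in the dataset
non_missing = sum(counts.values())   # rows that carry a zone value

for zone, n in counts.items():
    share_all = 100 * n / total_rows       # denominator: every row
    share_present = 100 * n / non_missing  # denominator: non-missing rows
    print(f"{zone}: {share_all:.1f}% of all rows, "
          f"{share_present:.1f}% of non-missing rows")
```

When quoting a zone's share, it is worth stating which denominator is meant, since the two differ by a factor of almost five here.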

Most occurring characters

ValueCountFrequency (%)
. 128790
33.3%
0 128790
33.3%
1 57713
14.9%
3 50072
 
13.0%
5 15941
 
4.1%
4 2642
 
0.7%
2 2422
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 257580
66.7%
Other Punctuation 128790
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 128790
50.0%
1 57713
22.4%
3 50072
 
19.4%
5 15941
 
6.2%
4 2642
 
1.0%
2 2422
 
0.9%
Other Punctuation
ValueCountFrequency (%)
. 128790
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 386370
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 128790
33.3%
0 128790
33.3%
1 57713
14.9%
3 50072
 
13.0%
5 15941
 
4.1%
4 2642
 
0.7%
2 2422
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 386370
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 128790
33.3%
0 128790
33.3%
1 57713
14.9%
3 50072
 
13.0%
5 15941
 
4.1%
4 2642
 
0.7%
2 2422
 
0.6%

Invest In Neighborhoods (IIN) Areas
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 610895
Missing (%): 100.0%
Memory size: 4.7 MiB

Current Supervisor Districts
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 11
Distinct (%): < 0.1%
Missing: 32728
Missing (%): 5.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.6868673
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 7
Q3: 10
95-th percentile: 11
Maximum: 11
Range: 10
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.3317477
Coefficient of variation (CV): 0.49825241
Kurtosis: -1.5116889
Mean: 6.6868673
Median Absolute Deviation (MAD): 3
Skewness: -0.20502285
Sum: 3866126
Variance: 11.100543
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
10 | 135319 | 22.2%
3 | 86050 | 14.1%
9 | 62705 | 10.3%
11 | 58887 | 9.6%
2 | 57260 | 9.4%
6 | 44144 | 7.2%
5 | 44095 | 7.2%
4 | 27563 | 4.5%
8 | 24609 | 4.0%
1 | 21174 | 3.5%
(Missing) | 32728 | 5.4%
Value | Count | Frequency (%)
1 | 21174 | 3.5%
2 | 57260 | 9.4%
3 | 86050 | 14.1%
4 | 27563 | 4.5%
5 | 44095 | 7.2%
6 | 44144 | 7.2%
7 | 16361 | 2.7%
8 | 24609 | 4.0%
9 | 62705 | 10.3%
10 | 135319 | 22.2%
Value | Count | Frequency (%)
11 | 58887 | 9.6%
10 | 135319 | 22.2%
9 | 62705 | 10.3%
8 | 24609 | 4.0%
7 | 16361 | 2.7%
6 | 44144 | 7.2%
5 | 44095 | 7.2%
4 | 27563 | 4.5%
3 | 86050 | 14.1%
2 | 57260 | 9.4%

Current Police Districts
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): < 0.1%
Missing: 33332
Missing (%): 5.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.9035188
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 7
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.7440637
Coefficient of variation (CV): 0.55961113
Kurtosis: -0.92846815
Mean: 4.9035188
Median Absolute Deviation (MAD): 2
Skewness: 0.32412312
Sum: 2832091
Variance: 7.5298854
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
6 | 88142 | 14.4%
4 | 82202 | 13.5%
3 | 76155 | 12.5%
1 | 74353 | 12.2%
5 | 55338 | 9.1%
2 | 53542 | 8.8%
9 | 44047 | 7.2%
10 | 43889 | 7.2%
8 | 33261 | 5.4%
7 | 26634 | 4.4%
(Missing) | 33332 | 5.5%
Value | Count | Frequency (%)
1 | 74353 | 12.2%
2 | 53542 | 8.8%
3 | 76155 | 12.5%
4 | 82202 | 13.5%
5 | 55338 | 9.1%
6 | 88142 | 14.4%
7 | 26634 | 4.4%
8 | 33261 | 5.4%
9 | 44047 | 7.2%
10 | 43889 | 7.2%
Value | Count | Frequency (%)
10 | 43889 | 7.2%
9 | 44047 | 7.2%
8 | 33261 | 5.4%
7 | 26634 | 4.4%
6 | 88142 | 14.4%
5 | 55338 | 9.1%
4 | 82202 | 13.5%
3 | 76155 | 12.5%
2 | 53542 | 8.8%
1 | 74353 | 12.2%

Interactions

[Figure: 169 pairwise interaction plots; the images are not recoverable from the extracted text.]

Correlations

Variable | Incident Year | Row ID | Incident ID | Incident Number | CAD Number | Incident Code | CNN | Supervisor District | Latitude | Longitude | Neighborhoods | Current Supervisor Districts | Current Police Districts | Incident Day of Week | Report Type Code | Report Type Description | Incident Category | Incident Subcategory | Resolution | Police District | Analysis Neighborhood | HSOC Zones as of 2018-06-05
Incident Year | 1.000 | 0.973 | 0.973 | 0.969 | 0.972 | -0.014 | 0.005 | 0.005 | -0.012 | -0.016 | 0.009 | 0.011 | 0.004 | 0.005 | 0.050 | 0.042 | 0.067 | 0.076 | 0.046 | 0.030 | 0.042 | 0.029
Row ID | 0.973 | 1.000 | 1.000 | 0.981 | 0.998 | -0.014 | 0.007 | 0.002 | -0.011 | -0.016 | 0.008 | 0.011 | 0.005 | 0.006 | 0.053 | 0.045 | 0.053 | 0.061 | 0.049 | 0.027 | 0.037 | 0.033
Incident ID | 0.973 | 1.000 | 1.000 | 0.981 | 0.998 | -0.014 | 0.007 | 0.002 | -0.011 | -0.016 | 0.008 | 0.011 | 0.005 | 0.006 | 0.053 | 0.045 | 0.053 | 0.061 | 0.049 | 0.027 | 0.037 | 0.033
Incident Number | 0.969 | 0.981 | 0.981 | 1.000 | 0.997 | -0.040 | 0.020 | -0.012 | 0.003 | -0.025 | 0.010 | 0.012 | 0.008 | 0.002 | 0.050 | 0.042 | 0.084 | 0.114 | 0.046 | 0.026 | 0.033 | 0.013
CAD Number | 0.972 | 0.998 | 0.998 | 0.997 | 1.000 | -0.023 | 0.002 | 0.009 | -0.013 | -0.010 | 0.008 | 0.016 | -0.004 | 0.004 | 0.063 | 0.063 | 0.071 | 0.077 | 0.151 | 0.023 | 0.030 | 0.018
Incident Code | -0.014 | -0.014 | -0.014 | -0.040 | -0.023 | 1.000 | -0.058 | 0.071 | -0.070 | 0.023 | -0.011 | 0.015 | -0.014 | 0.019 | 0.163 | 0.207 | 0.824 | 0.824 | 0.243 | 0.092 | 0.092 | 0.087
CNN | 0.005 | 0.007 | 0.007 | 0.020 | 0.002 | -0.058 | 1.000 | -0.624 | 0.482 | -0.362 | -0.241 | 0.147 | 0.184 | 0.008 | 0.073 | 0.069 | 0.086 | 0.107 | 0.044 | 0.490 | 0.653 | 0.163
Supervisor District | 0.005 | 0.002 | 0.002 | -0.012 | 0.009 | 0.071 | -0.624 | 1.000 | -0.785 | 0.275 | 0.291 | -0.055 | -0.290 | 0.012 | 0.088 | 0.096 | 0.117 | 0.131 | 0.093 | 0.659 | 0.888 | 0.817
Latitude | -0.012 | -0.011 | -0.011 | 0.003 | -0.013 | -0.070 | 0.482 | -0.785 | 1.000 | 0.149 | -0.192 | 0.094 | -0.045 | 0.011 | 0.084 | 0.083 | 0.087 | 0.107 | 0.060 | 0.472 | 0.762 | 0.603
Longitude | -0.016 | -0.016 | -0.016 | -0.025 | -0.010 | 0.023 | -0.362 | 0.275 | 0.149 | 1.000 | 0.134 | 0.091 | -0.625 | 0.009 | 0.049 | 0.057 | 0.087 | 0.095 | 0.084 | 0.472 | 0.674 | 0.739
Neighborhoods | 0.009 | 0.008 | 0.008 | 0.010 | 0.008 | -0.011 | -0.241 | 0.291 | -0.192 | 0.134 | 1.000 | -0.201 | 0.006 | 0.008 | 0.078 | 0.076 | 0.092 | 0.105 | 0.080 | 0.601 | 0.755 | 0.763
Current Supervisor Districts | 0.011 | 0.011 | 0.011 | 0.012 | 0.016 | 0.015 | 0.147 | -0.055 | 0.094 | 0.091 | -0.201 | 1.000 | -0.298 | 0.013 | 0.087 | 0.097 | 0.116 | 0.131 | 0.092 | 0.669 | 0.887 | 0.817
Current Police Districts | 0.004 | 0.005 | 0.005 | 0.008 | -0.004 | -0.014 | 0.184 | -0.290 | -0.045 | -0.625 | 0.006 | -0.298 | 1.000 | 0.013 | 0.095 | 0.102 | 0.126 | 0.135 | 0.101 | 0.942 | 0.865 | 0.629
Incident Day of Week | 0.005 | 0.006 | 0.006 | 0.002 | 0.004 | 0.019 | 0.008 | 0.012 | 0.011 | 0.009 | 0.008 | 0.013 | 0.013 | 1.000 | 0.024 | 0.027 | 0.032 | 0.034 | 0.021 | 0.015 | 0.018 | 0.019
Report Type Code | 0.050 | 0.053 | 0.053 | 0.050 | 0.063 | 0.163 | 0.073 | 0.088 | 0.084 | 0.049 | 0.078 | 0.087 | 0.095 | 0.024 | 1.000 | 1.000 | 0.701 | 0.753 | 0.137 | 0.201 | 0.105 | 0.045
Report Type Description | 0.042 | 0.045 | 0.045 | 0.042 | 0.063 | 0.207 | 0.069 | 0.096 | 0.083 | 0.057 | 0.076 | 0.097 | 0.102 | 0.027 | 1.000 | 1.000 | 0.609 | 0.665 | 0.220 | 0.175 | 0.120 | 0.059
Incident Category | 0.067 | 0.053 | 0.053 | 0.084 | 0.071 | 0.824 | 0.086 | 0.117 | 0.087 | 0.087 | 0.092 | 0.116 | 0.126 | 0.032 | 0.701 | 0.609 | 1.000 | 0.871 | 0.440 | 0.177 | 0.076 | 0.145
Incident Subcategory | 0.076 | 0.061 | 0.061 | 0.114 | 0.077 | 0.824 | 0.107 | 0.131 | 0.107 | 0.095 | 0.105 | 0.131 | 0.135 | 0.034 | 0.753 | 0.665 | 0.871 | 1.000 | 0.416 | 0.184 | 0.083 | 0.149
Resolution | 0.046 | 0.049 | 0.049 | 0.046 | 0.151 | 0.243 | 0.044 | 0.093 | 0.060 | 0.084 | 0.080 | 0.092 | 0.101 | 0.021 | 0.137 | 0.220 | 0.440 | 0.416 | 1.000 | 0.108 | 0.124 | 0.057
Police District | 0.030 | 0.027 | 0.027 | 0.026 | 0.023 | 0.092 | 0.490 | 0.659 | 0.472 | 0.472 | 0.601 | 0.669 | 0.942 | 0.015 | 0.201 | 0.175 | 0.177 | 0.184 | 0.108 | 1.000 | 0.799 | 0.655
Analysis Neighborhood | 0.042 | 0.037 | 0.037 | 0.033 | 0.030 | 0.092 | 0.653 | 0.888 | 0.762 | 0.674 | 0.755 | 0.887 | 0.865 | 0.018 | 0.105 | 0.120 | 0.076 | 0.083 | 0.124 | 0.799 | 1.000 | 0.862
HSOC Zones as of 2018-06-05 | 0.029 | 0.033 | 0.033 | 0.013 | 0.018 | 0.087 | 0.163 | 0.817 | 0.603 | 0.739 | 0.763 | 0.817 | 0.629 | 0.019 | 0.045 | 0.059 | 0.145 | 0.149 | 0.057 | 0.655 | 0.862 | 1.000
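
The High correlation alerts in the overview can be related to this matrix by filtering pairs on the absolute value of their coefficient. The sketch below, in Python, uses a handful of coefficients copied from the matrix above; the 0.5 cut-off and the helper name high_correlation are assumptions for illustration, since the report does not state the threshold it applies:

```python
# A few coefficients copied from the correlation matrix (not the full matrix).
pairs = {
    ("Row ID", "Incident ID"): 1.000,
    ("Latitude", "Supervisor District"): -0.785,
    ("Longitude", "Current Police Districts"): -0.625,
    ("Incident Code", "Incident Category"): 0.824,
    ("Incident Year", "Incident Day of Week"): 0.005,
    ("CNN", "Neighborhoods"): -0.241,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_correlation(pairs, threshold):
    """Return (a, b, r) for pairs with |r| >= threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


for a, b, r in high_correlation(pairs, THRESHOLD):
    print(f"{a} vs {b}: {r:+.3f}")
```

Using the absolute value matters here: the Latitude and Longitude pairs above are strongly related but with a negative sign.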

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
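
One common way to compute a nullity correlation is to turn each column into a present/missing indicator and take the Pearson correlation of the indicators. The Python sketch below follows that reading; it is an illustration with made-up toy rows, not necessarily the exact computation behind the heatmap, and nullity_correlation is a hypothetical helper name:

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Toy rows (hypothetical values): coordinates go missing together,
# the category is missing independently of them.
latitude = [37.78, None, 37.77, None, 37.72]
longitude = [-122.40, None, -122.41, None, -122.47]
category = ["Larceny Theft", "Lost Property", None, "Vandalism", "Assault"]

print(nullity_correlation(latitude, longitude))
print(nullity_correlation(latitude, category))
```

A value near 1 means two columns tend to be missing on the same rows, which is the pattern to expect from the location-derived columns in this dataset given their near-identical missing counts.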

Sample

First rows

Index | Incident Datetime | Incident Date | Incident Time | Incident Year | Incident Day of Week | Report Datetime | Row ID | Incident ID | Incident Number | CAD Number | Report Type Code | Report Type Description | Filed Online | Incident Code | Incident Category | Incident Subcategory | Incident Description | Resolution | Intersection | CNN | Police District | Analysis Neighborhood | Supervisor District | Latitude | Longitude | Point | Neighborhoods | ESNCAG - Boundary File | Central Market/Tenderloin Boundary Polygon - Updated | Civic Center Harm Reduction Project Boundary | HSOC Zones as of 2018-06-05 | Invest In Neighborhoods (IIN) Areas | Current Supervisor Districts | Current Police Districts
0 | 25-07-2021 00:00 | 25-07-2021 | 00:00 | 2021 | Sunday | 25-07-2021 13:41 | 1.057190e+11 | 1057189 | 216105573 | NaN | II | Coplogic Initial | True | 6372 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, $50-$200 | Open or Active | NaN | NaN | Southern | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
1 | 28-06-2022 23:58 | 28-06-2022 | 23:58 | 2022 | Tuesday | 28-06-2022 23:58 | 1.165540e+11 | 1165543 | 220264913 | NaN | VS | Vehicle Supplement | NaN | 71012 | Other Offenses | Other Offenses | License Plate, Recovered | Open or Active | NaN | NaN | Out of SF | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2 | 11-03-2022 10:30 | 11-03-2022 | 10:30 | 2022 | Friday | 11-03-2022 20:03 | 1.130480e+11 | 1130480 | 226040232 | NaN | II | Coplogic Initial | True | 71000 | Lost Property | Lost Property | Lost Property | Open or Active | NaN | NaN | Central | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
3 | 15-05-2021 17:47 | 15-05-2021 | 17:47 | 2021 | Saturday | 15-05-2021 17:47 | 1.030520e+11 | 1030518 | 210183345 | NaN | VS | Vehicle Supplement | NaN | 7043 | Recovered Vehicle | Recovered Vehicle | Vehicle, Recovered, Motorcycle | Open or Active | NaN | NaN | Out of SF | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
4 | 28-06-2022 17:22 | 28-06-2022 | 17:22 | 2022 | Tuesday | 28-06-2022 17:22 | 1.165350e+11 | 1165351 | 220361741 | NaN | VS | Vehicle Supplement | NaN | 7041 | Recovered Vehicle | Recovered Vehicle | Vehicle, Recovered, Auto | Open or Active | NaN | NaN | Out of SF | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
5 | 18-11-2021 13:30 | 18-11-2021 | 13:30 | 2021 | Thursday | 18-11-2021 16:24 | 1.094350e+11 | 1094352 | 216178097 | NaN | II | Coplogic Initial | True | 28150 | Malicious Mischief | Vandalism | Malicious Mischief, Vandalism to Property | Open or Active | NaN | NaN | Mission | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
6 | 28-06-2022 14:00 | 28-06-2022 | 14:00 | 2022 | Tuesday | 28-06-2022 15:09 | 1.165460e+11 | 1165462 | 226109026 | NaN | II | Coplogic Initial | True | 6244 | Larceny Theft | Larceny - From Vehicle | Theft, From Locked Vehicle, >$950 | Open or Active | NaN | NaN | Richmond | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
7 | 07-08-2022 22:00 | 07-08-2022 | 22:00 | 2022 | Sunday | 16-08-2022 15:26 | 1.182770e+11 | 1182772 | 226147327 | NaN | II | Coplogic Initial | True | 6374 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Open or Active | NaN | NaN | Richmond | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
8 | 04-05-2022 09:38 | 04-05-2022 | 09:38 | 2022 | Wednesday | 04-05-2022 09:39 | 1.147130e+11 | 1147129 | 210618041 | NaN | VS | Vehicle Supplement | NaN | 7041 | Recovered Vehicle | Recovered Vehicle | Vehicle, Recovered, Auto | Open or Active | NaN | NaN | Out of SF | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
9 | 11-05-2018 17:30 | 11-05-2018 | 17:30 | 2018 | Friday | 13-05-2018 13:50 | 6.691847e+10 | 669184 | 186110767 | NaN | II | Coplogic Initial | True | 71000 | Lost Property | Lost Property | Lost Property | Open or Active | NaN | NaN | Out of SF | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN

Last rows

Index | Incident Datetime | Incident Date | Incident Time | Incident Year | Incident Day of Week | Report Datetime | Row ID | Incident ID | Incident Number | CAD Number | Report Type Code | Report Type Description | Filed Online | Incident Code | Incident Category | Incident Subcategory | Incident Description | Resolution | Intersection | CNN | Police District | Analysis Neighborhood | Supervisor District | Latitude | Longitude | Point | Neighborhoods | ESNCAG - Boundary File | Central Market/Tenderloin Boundary Polygon - Updated | Civic Center Harm Reduction Project Boundary | HSOC Zones as of 2018-06-05 | Invest In Neighborhoods (IIN) Areas | Current Supervisor Districts | Current Police Districts
610885 | 01-08-2020 14:51 | 01-08-2020 | 14:51 | 2020 | Saturday | 01-08-2020 14:52 | 9.490955e+10 | 949095 | 200460567 | 202141636.0 | II | Initial | NaN | 51040 | Non-Criminal | Non-Criminal | Aided Case | Open or Active | 20TH AVE \ TARAVAL ST | 23195000.0 | Taraval | Sunset/Parkside | 4.0 | 37.743003 | -122.476765 | POINT (-122.47676460087209 37.74300263165964) | 40.0 | NaN | NaN | NaN | NaN | NaN | 7.0 | 10.0
610886 | 13-08-2020 15:00 | 13-08-2020 | 15:00 | 2020 | Thursday | 14-08-2020 09:55 | 9.527101e+10 | 952710 | 200487668 | 202271002.0 | VI | Vehicle Initial | NaN | 7023 | Motor Vehicle Theft | Motor Vehicle Theft | Vehicle, Stolen, Motorcycle | Open or Active | MANGELS AVE \ BURNSIDE AVE | 21978000.0 | Ingleside | West of Twin Peaks | 8.0 | 37.733087 | -122.438816 | POINT (-122.43881594927997 37.733086632346776) | 95.0 | NaN | NaN | NaN | NaN | NaN | 5.0 | 9.0
610887 | 26-09-2020 15:14 | 26-09-2020 | 15:14 | 2020 | Saturday | 26-09-2020 15:16 | 9.646401e+10 | 964640 | 200580234 | 202702112.0 | II | Initial | NaN | 6153 | Larceny Theft | Larceny Theft - Other | Theft, From Person, $200-$950 (other than Pickpocket) | Open or Active | BEALE ST \ MISSION ST | 24554000.0 | Central | Financial District/South Beach | 6.0 | 37.791153 | -122.395813 | POINT (-122.39581342280272 37.791152807557935) | 108.0 | NaN | NaN | NaN | NaN | NaN | 10.0 | 1.0
610888 | 10-06-2020 16:00 | 10-06-2020 | 16:00 | 2020 | Wednesday | 10-06-2020 16:00 | 9.343903e+10 | 934390 | 200328307 | NaN | IS | Initial Supplement | NaN | 28150 | Malicious Mischief | Vandalism | Malicious Mischief, Vandalism to Property | Open or Active | TEHAMA ST \ GALLAGHER LN | 28147000.0 | Southern | South of Market | 6.0 | 37.781929 | -122.403328 | POINT (-122.40332753664748 37.78192890777912) | 32.0 | NaN | NaN | NaN | NaN | NaN | 10.0 | 1.0
610889 | 20-10-2020 10:40 | 20-10-2020 | 10:40 | 2020 | Tuesday | 20-10-2020 10:41 | 9.733497e+10 | 973349 | 200620040 | 200620040.0 | IS | Initial Supplement | NaN | 71013 | Larceny Theft | Theft From Vehicle | License Plate, Stolen | Open or Active | 04TH ST \ LONG BRIDGE ST | 34168000.0 | Out of SF | Mission Bay | 6.0 | 37.773467 | -122.391434 | POINT (-122.39143433652146 37.773466920607476) | 34.0 | NaN | NaN | NaN | NaN | NaN | 10.0 | 1.0
610890 | 04-12-2020 00:00 | 04-12-2020 | 00:00 | 2020 | Friday | 05-12-2020 10:44 | 9.843021e+10 | 984302 | 200733182 | 203400924.0 | VI | Vehicle Initial | NaN | 7021 | Motor Vehicle Theft | Motor Vehicle Theft | Vehicle, Stolen, Auto | Open or Active | BAY SHORE BLVD \ COSGROVE ST | 33284000.0 | Bayview | Bayview Hunters Point | 9.0 | 37.742392 | -122.405838 | POINT (-122.40583815976386 37.74239176061754) | 82.0 | NaN | NaN | NaN | NaN | NaN | 2.0 | 2.0
610891 | 20-09-2020 00:21 | 20-09-2020 | 00:21 | 2020 | Sunday | 20-09-2020 00:21 | 9.628556e+10 | 962855 | 200566159 | 202640065.0 | II | Initial | NaN | 64085 | Other Miscellaneous | Other | Investigative Detention | Cite or Arrest Adult | ELLIS ST \ LARKIN ST | 25149000.0 | Tenderloin | Tenderloin | 6.0 | 37.784236 | -122.417707 | POINT (-122.4177067508564 37.78423573864025) | 20.0 | NaN | 1.0 | NaN | NaN | NaN | 10.0 | 5.0
610892 | 17-09-2020 06:15 | 17-09-2020 | 06:15 | 2020 | Thursday | 17-09-2020 08:51 | 9.623941e+10 | 962394 | 206135336 | NaN | II | Coplogic Initial | True | 6374 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Open or Active | 46TH AVE \ IRVING ST | 27949000.0 | Taraval | Sunset/Parkside | 4.0 | 37.762285 | -122.506059 | POINT (-122.50605907625517 37.76228499654453) | 39.0 | NaN | NaN | NaN | NaN | NaN | 7.0 | 10.0
610893 | 08-08-2020 01:00 | 08-08-2020 | 01:00 | 2020 | Saturday | 18-08-2020 16:53 | 9.544413e+10 | 954441 | 206123773 | NaN | II | Coplogic Initial | True | 28150 | Malicious Mischief | Vandalism | Malicious Mischief, Vandalism to Property | Open or Active | 02ND ST \ NATOMA ST | 24543000.0 | Southern | Financial District/South Beach | 6.0 | 37.787203 | -122.398790 | POINT (-122.39878960122489 37.787203462687714) | 32.0 | NaN | NaN | NaN | NaN | NaN | 10.0 | 1.0
610894 | 17-01-2021 15:00 | 17-01-2021 | 15:00 | 2021 | Sunday | 17-01-2021 15:00 | 9.969712e+10 | 996971 | 210038126 | 210171798.0 | II | Initial | NaN | 19057 | Disorderly Conduct | Intimidation | Terrorist Threats | Open or Active | 04TH ST \ LONG BRIDGE ST | 34168000.0 | Southern | Mission Bay | 6.0 | 37.773467 | -122.391434 | POINT (-122.39143433652146 37.773466920607476) | 34.0 | NaN | NaN | NaN | NaN | NaN | 10.0 | 1.0